HCL hcl-wiki https://hcl.ucd.ie/wiki/index.php/Main_Page MediaWiki 1.27.1

= Help:Editing =
* http://www.mediawiki.org/wiki/Help:Formatting
* http://www.mediawiki.org/wiki/Help:Contents

= Linux =
== Environment ==
* '''.*rc''' - read by non-login shells
* '''.*profile''' - read by login shells; uses the rc settings
=== Recommended settings ===
<source lang="bash">
PATH=$HOME/bin:$PATH
LD_LIBRARY_PATH=$HOME/lib:$LD_LIBRARY_PATH
</source>
== Utilities ==
* '''mc''' (midnight commander) - a file manager with a built-in text editor. To copy text, hold the shift button.

= Windows =
== SSH clients ==
* PuTTY http://www.chiark.greenend.org.uk/~sgtatham/putty/
* Cygwin http://www.cygwin.com/ (includes X Windows and can be used with X11 forwarding)
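The recommended Linux settings above take effect once exported from the appropriate startup file; a minimal sketch, assuming bash (lines like these would go in <code>~/.bashrc</code> for non-login shells):

```shell
# Sketch: prepend the per-user directories to the search paths,
# as in the "Recommended settings" for Linux above.
export PATH="$HOME/bin:$PATH"
# Guard against introducing an empty component when the variable is unset.
export LD_LIBRARY_PATH="$HOME/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
```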
= HCL cluster =
http://hcl.ucd.ie/Hardware
* Add to your environment:
<source lang="bash">
export ARCH=`uname -r`
if [ `hostname` == 'hcl13.ucd.ie' ]; then
    export ARCH=`uname -r`smp
fi
</source>
* On hcl09 and hcl10, create a directory $HOME/$ARCH. Actually, the 2.4.27-2-386 and 2.6.11-1.1369_FC4smp directories will be created.
* Configure all the software on hcl09 and hcl10 with <code>--prefix=$ARCH</code>
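The steps above can be combined into one snippet; a sketch, assuming the hostname convention shown (only hcl13.ucd.ie runs the smp kernel) and that the per-architecture prefix is meant as <code>$HOME/$ARCH</code>:

```shell
# Sketch: derive a per-kernel architecture tag and create its install tree.
ARCH=`uname -r`                       # e.g. 2.4.27-2-386
if [ "`hostname`" = "hcl13.ucd.ie" ]; then
    ARCH="${ARCH}smp"                 # hcl13 runs the smp kernel
fi
export ARCH
mkdir -p "$HOME/$ARCH"                # one install tree per kernel flavour
# Each package is then configured with --prefix="$HOME/$ARCH"
```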
= Grid5000 =
https://www.grid5000.fr/mediawiki/index.php/Grid5000:Home

= Dia =
http://live.gnome.org/Dia/
http://dia-installer.de/index_en.html (for Windows)
= Eclipse =
http://www.eclipse.org
* Install Sun Java 6 (Debian package: sun-java6-jre)
* It is recommended to use the latest version from http://www.eclipse.org/downloads/
* Eclipse CDT - a distribution for C/C++ development
* How to install plugins: http://agile.csc.ncsu.edu/SEMaterials/tutorials/install_plugin/ http://www.venukb.com/2006/08/20/install-eclipse-plugins-the-easy-way/
== Plugins ==
* [[Linuxtools]]
* [[Eclox]]
* [[Subversive]] or [[Subclipse]]
= Eclox =
[[Doxygen]] for [[Eclipse]] http://eclox.eu/
Eclipse install: http://download.gna.org/eclox/update/site.xml

= Subversive =
[[Subversion]] for [[Eclipse]] http://www.eclipse.org/subversive/
Eclipse install: http://download.eclipse.org/technology/subversive/0.7/update-site/ + select a client at the first run

= Subclipse =
[[Subversion]] for [[Eclipse]] http://subclipse.tigris.org/
Eclipse install: http://subclipse.tigris.org/update_1.4.x (for the Subversion 1.5 client) or http://subclipse.tigris.org/update_1.6.x (for the Subversion 1.6 client)
'''There is a bug when it runs under Linux'''
= C/C++ =
* The one-true-brace indent style is preferable (http://en.wikipedia.org/wiki/Indent_style)
* Header files: http://en.wikipedia.org/wiki/Pragma_once
* Mixing C/C++: http://developers.sun.com/solaris/articles/mixing.html
* Don't use non-standard functions like [http://en.wikipedia.org/wiki/Itoa itoa]

= BLAS/LAPACK/ScaLAPACK =
* Original http://www.netlib.org/blas/ http://www.netlib.org/lapack/ (Fortran)
* ATLAS http://math-atlas.sourceforge.net/ (C; LAPACK is implemented partially)
* MKL http://software.intel.com/en-us/intel-mkl/
BLAS overview, installation and usage: http://www.inf.bv.tum.de/~heisserer/softwarelab04/doc/blas_report.pdf
= ChangeLog =
* either <code>svn log -v > ChangeLog</code> with [[Subversion]]
* or http://en.wikipedia.org/wiki/Changelog with the [[Linuxtools]] plugin in [[Eclipse]]
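The <code>svn log</code> approach above can be wrapped so it degrades gracefully outside a working copy; a sketch (the fallback message is illustrative):

```shell
# Sketch: generate ChangeLog from Subversion history when possible.
if command -v svn >/dev/null 2>&1 && [ -d .svn ]; then
    svn log -v > ChangeLog          # full, verbose revision history
else
    # Not inside a Subversion working copy: leave a note instead of failing.
    echo "not a Subversion working copy" > ChangeLog
fi
```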
= Subversion =
http://svnbook.red-bean.com/
* Subversion clients work with <code>.svn</code> directories - don't remove them.
* Mind the version of the client (currently 1.5 or 1.6).
== Repositories ==
* http://hcl.ucd.ie/repos/project_name - read-only access
* https://hcl.ucd.ie/repos/project_name - authenticated user access
== To submit ==
* Software sources: models, code, resource files
* Documentation sources: texts, diagrams, data
* Configuration files
* Test sources: code, input data
== Not to submit ==
* Binaries: object files, libraries, executables
* Built documentation: html, pdf
* Personal settings: Eclipse projects, ...
* Test output

= LaTeX =
* Beamer - a package for presentation slides
* LaTeX can be used in [[Doxygen]] documentation, including formulas
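A minimal Beamer document, as a sketch of the slide package mentioned above (the frame title and the formula are illustrative):

```latex
\documentclass{beamer}
\begin{document}
\begin{frame}{Heterogeneous computing}
  A slide with a formula, of the kind that can also appear
  in Doxygen-generated documentation:
  \[ T(n) = \alpha + \beta n \]
\end{frame}
\end{document}
```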
= Autotools =
* http://en.wikipedia.org/wiki/Autoconf
* http://www.gnu.org/software/autoconf/manual/index.html
* http://www.gnu.org/software/automake/manual/index.html
* http://sources.redhat.com/autobook/autobook/autobook.html

= Gnuplot =
* http://www.gnuplot.info/documentation.html
* http://t16web.lanl.gov/Kawano/gnuplot/index-e.html

= Graphviz =
* Can be used with [[Doxygen]] for class diagrams
* Can be used with the [[Boost]] Graph library for graph visualization

= R =
http://cran.r-project.org/

= GSL =
http://www.gnu.org/software/gsl/

= Boost =
http://www.boost.org/

= HCL:About =
UCD Heterogeneous Computing Laboratory (HCL) http://hcl.ucd.ie
= Libraries =
== Installation from packages ==
* Install a development package (<code>*-dev</code>), which includes header files and static libraries. Base packages will be installed automatically.
* Headers and libraries are placed in standard paths, which compilers search by default.
== Manual installation ==
* Download a package or export/checkout a repository to <code>$HOME/src/DIR</code>
* Configure with <code>--prefix=$HOME</code>
* Update the [[Linux]] environment for [[Autotools]] and [[C/C++]]:
<source lang="bash">
CPATH=$HOME/include:$CPATH
LIBRARY_PATH=$HOME/lib:$LIBRARY_PATH
</source>
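The manual-installation steps above, as one shell sketch (the package name <code>mylib-1.0</code> is hypothetical, and the build commands are commented out because they depend on the package):

```shell
# Sketch of the manual installation flow described above.
PREFIX="$HOME"
mkdir -p "$HOME/src"
# cd "$HOME/src" && tar xzf mylib-1.0.tar.gz && cd mylib-1.0   # hypothetical package
# ./configure --prefix="$PREFIX" && make && make install
# Afterwards, let the compiler and linker find the installed files:
export CPATH="$PREFIX/include${CPATH:+:$CPATH}"
export LIBRARY_PATH="$PREFIX/lib${LIBRARY_PATH:+:$LIBRARY_PATH}"
```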
= Main Page =
This site is set up for sharing ideas, findings and experience in heterogeneous computing. Please log in and create new pages or edit existing ones. To learn how to format wiki pages, read [[Help:Editing|here]].
== Operating systems ==
* [[Linux]]
* [[Windows]]
== Development tools ==
* [[C/C++]]
* [[Autotools]]
* [[GDB]]
* [[OProfile]], [[Valgrind]]
* [[ChangeLog]]
* [[Doxygen]]
* [[Subversion]]
* [[Eclipse]]
== [[Libraries]] ==
* [[MPI]]
* [[Boost]]
* [[GSL]]
* [[R]]
* [[BLAS/LAPACK]]
* [[The MPI LogP Benchmark|logp_mpi]]
== Data processing ==
* [[gnuplot]]
* [[Graphviz]]
== Presentation ==
* [[Dia]]
* [[Latex]]
== Clusters ==
* [[HCL cluster]]
* [[UTK clusters]]
* [[Grid5000]]
== Some Maths ==
* [http://en.wikipedia.org/wiki/Confidence_interval Confidence interval (statistics)]. [http://en.wikipedia.org/wiki/Student's_t-distribution Student's t-distribution], which is used to find a confidence interval, is implemented in [[GSL]] (<code>gsl_cdf_tdist_Pinv</code>)
* [http://en.wikipedia.org/wiki/Binomial_tree#Binomial_tree Binomial tree]. Use [[Graphviz]] to visualize trees

= MPI =
Open MPI: [[ http://www.open-mpi.org | http://www.open-mpi.org ]] feaf56449b5f0fe7837cab92edd5553879fdccb0 74 73 2010-01-18T16:10:06Z Kiril 3 wikitext text/x-wiki You can choose different MPIs. E.g. Open MPI: [[ Bttp://www.open-mpi.org | http://www.open-mpi.org ]] 8175769504c7f8d0def62fd1bf2b645830dd1f32 75 74 2010-01-18T16:10:31Z Kiril 3 wikitext text/x-wiki You can choose different MPIs. E.g. Open MPI [[http://www.open-mpi.org | http://www.open-mpi.org ]] c649b3e048ec956a7fe048cc9eb53b7ce25a3109 76 75 2010-01-18T16:10:47Z Kiril 3 wikitext text/x-wiki You can choose different MPIs. E.g. Open MPI [[ http://www.open-mpi.org ]] 97e8bfd571b0c6305de38e6ff3a1e1e37240527f 77 76 2010-01-18T16:11:12Z Kiril 3 wikitext text/x-wiki You can choose different MPIs. E.g. Open MPI [[http://www.open-mpi.org | http://www.open-mpi.org ]] c649b3e048ec956a7fe048cc9eb53b7ce25a3109 78 77 2010-01-18T16:11:28Z Kiril 3 wikitext text/x-wiki You can choose different MPIs. E.g. Open MPI [ http://www.open-mpi.org | http://www.open-mpi.org ] 7bd27596c18403977530b4b19066eaae4539f95d 79 78 2010-01-18T16:11:38Z Kiril 3 wikitext text/x-wiki You can choose different MPIs. E.g. Open MPI [[ http://www.open-mpi.org | http://www.open-mpi.org ]] d331eff2704c8495f1740a72d3c3e14d088cd829 80 79 2010-01-18T16:11:52Z Kiril 3 wikitext text/x-wiki You can choose different MPIs. E.g. Open MPI [[http://www.open-mpi.org | http://www.open-mpi.org ]] c649b3e048ec956a7fe048cc9eb53b7ce25a3109 81 80 2010-01-18T16:12:23Z Kiril 3 wikitext text/x-wiki You can choose different MPIs. E.g. [[Open MPI | http://www.open-mpi.org ]] e2a7dc66f3c8460440994b209c5b3b99530c3049 82 81 2010-01-18T16:12:45Z Kiril 3 wikitext text/x-wiki You can choose different MPIs. E.g. [[ http://www.open-mpi.org | Open MPI ]] 011045f0d95b4849aa30ee29292fbe9117699b8d 83 82 2010-01-18T16:13:00Z Kiril 3 wikitext text/x-wiki You can choose different MPIs. E.g. 
[[http://www.open-mpi.org | Open MPI ]] 3e7d903020eae5d1f441391bc529a9f7efe5aadc 84 83 2010-01-18T16:13:24Z Kiril 3 wikitext text/x-wiki You can choose different MPIs: [[ http://www.open-mpi.org | Open MPI ]] 8b8d2989d80afa196381f9078d5711332f4d46a7 85 84 2010-01-18T16:13:45Z Kiril 3 wikitext text/x-wiki You can choose different MPIs: [[ http://www.open-mpi.org | OpenMPI ]] 9f0bc0213cf4a674febd9db9204c91ca3dc056a1 86 85 2010-01-18T16:13:57Z Kiril 3 wikitext text/x-wiki You can choose different MPIs: [[ http://www.open-mpi.org ]] 1c85ec40db429320ca7602634872a9090842fb97 87 86 2010-01-18T16:14:15Z Kiril 3 wikitext text/x-wiki You can choose different MPIs: [[http://www.open-mpi.org]] 6c72a9bc7efd603618886e75d094b14044f8ffa1 88 87 2010-01-18T16:14:27Z Kiril 3 wikitext text/x-wiki You can choose different MPIs: [[ http://www.open-mpi.org ]] 1c85ec40db429320ca7602634872a9090842fb97 89 88 2010-01-18T16:14:43Z Kiril 3 wikitext text/x-wiki You can choose different MPIs: [ [http://www.open-mpi.org] ] 3302a2b59342de8edce8dd867900aaab998ed29a 90 89 2010-01-18T16:15:05Z Kiril 3 wikitext text/x-wiki You can choose different MPIs: Open MPI: [http://www.open-mpi.org] 1d56e1e7a89bb28aa5233d607595929b9bae2cf1 Doxygen 0 16 107 35 2010-01-19T11:18:22Z Root 1 wikitext text/x-wiki http://www.stack.nl/~dimitri/doxygen/ * Requires [[Graphviz|dot]], [[Latex]] * Qt style is preferable 9c495ae2948cbc63ec6afd334e29d2a72a75f979 119 107 2010-01-28T18:37:28Z Root 1 wikitext text/x-wiki http://www.stack.nl/~dimitri/doxygen/ * Requires [[Graphviz|dot]], [[Latex]] * The Qt documenting style is preferable d51d059132dac83a896619bc8a54a9c28d84a3d3 C/C++ 0 14 108 31 2010-01-22T14:58:43Z Root 1 wikitext text/x-wiki * The one-true-brace indent style is preferable (http://en.wikipedia.org/wiki/Indent_style) * Coding header files http://en.wikipedia.org/wiki/Pragma_once * Mixing C/C++ http://developers.sun.com/solaris/articles/mixing.html * Don't use non-standard functions, like 
[http://en.wikipedia.org/wiki/Itoa itoa] * [http://en.wikipedia.org/wiki/Template_metaprogramming Template C++] a12303e0fd1c7fae56be0296a63db0327934d4c9 109 108 2010-01-26T15:26:35Z Root 1 wikitext text/x-wiki == Coding == * One-true-brace ident style is preferrable (http://en.wikipedia.org/wiki/Indent_style) * Coding header files http://en.wikipedia.org/wiki/Pragma_once == General == * Don't use non-standard functions, like [http://en.wikipedia.org/wiki/Itoa itoa] * [http://en.wikipedia.org/wiki/Shared_library#Shared_libraries Shared libraries] and [http://en.wikipedia.org/wiki/Dynamic_loading Dynamic loading] == C++ == * Mixing C/C++ http://developers.sun.com/solaris/articles/mixing.html * [http://en.wikipedia.org/wiki/Template_metaprogramming Template C++] d03294272a835b56a16ab56c97a97145fdb25c01 110 109 2010-01-26T15:29:32Z Root 1 wikitext text/x-wiki == Coding == * [http://en.wikipedia.org/wiki/Indent_style#Variant:_1TBS One-true-brace ident style] * [http://en.wikipedia.org/wiki/Pragma_once Coding header files] == General == * Don't use non-standard functions, like [http://en.wikipedia.org/wiki/Itoa itoa] * [http://en.wikipedia.org/wiki/Shared_library#Shared_libraries Shared libraries] and [http://en.wikipedia.org/wiki/Dynamic_loading Dynamic loading] == C++ == * [http://developers.sun.com/solaris/articles/mixing.html Mixing C/C++] * [http://en.wikipedia.org/wiki/Template_metaprogramming Template C++] 5f952f69c39294c71e130e9f8a3bc7cfd59ac96e 120 110 2010-01-28T18:45:26Z Root 1 wikitext text/x-wiki == Coding == * [http://en.wikipedia.org/wiki/Indent_style#Variant:_1TBS One-true-brace ident style] * [http://en.wikipedia.org/wiki/Pragma_once Coding header files] == Commenting == * Place [[Doxygen]] comments in header files (before declarations of namespaces/classes/structs/typedefs/macros) and main source files (for documenting tools and tests) * Use double forward slash for short comments in the code == C++ == * 
[http://developers.sun.com/solaris/articles/mixing.html Mixing C/C++] * [http://en.wikipedia.org/wiki/Template_metaprogramming Template C++] == General == * Don't use non-standard functions, like [http://en.wikipedia.org/wiki/Itoa itoa] * [http://en.wikipedia.org/wiki/Shared_library#Shared_libraries Shared libraries] and [http://en.wikipedia.org/wiki/Dynamic_loading Dynamic loading] 3075d12811ae32b7997e9a960201bab0284c1131 Eclipse 0 9 111 54 2010-01-26T15:53:45Z Root 1 wikitext text/x-wiki http://www.eclipse.org * Install Sun Java 6 (Debian package: sun-java6-jre) * It is recommended to use the latest version from http://www.eclipse.org/downloads/ Eclipse CDT - a distribution for C/C++ development * How to install plugins: http://agile.csc.ncsu.edu/SEMaterials/tutorials/install_plugin/ http://www.venukb.com/2006/08/20/install-eclipse-plugins-the-easy-way/ == Plugins == * [[Linuxtools]] * [[Eclox]] * [[Subversive]] or [[Subclipse]] == Preferences == * To avoid unresolved things, add them either in Eclipse (Project -> Properties -> C/C++ General -> Paths and Symbols) or in environment (see [[Libraries]], CPATH) 92dae59555da303e584189644e368b05ba435d94 112 111 2010-01-26T15:55:04Z Root 1 wikitext text/x-wiki http://www.eclipse.org * Install Sun Java 6 (Debian package: sun-java6-jre) * It is recommended to use the latest version from http://www.eclipse.org/downloads/ Eclipse CDT - a distribution for C/C++ development * How to install plugins: http://agile.csc.ncsu.edu/SEMaterials/tutorials/install_plugin/ http://www.venukb.com/2006/08/20/install-eclipse-plugins-the-easy-way/ == Plugins == * [[Linuxtools]] * [[Eclox]] * [[Subversive]] or [[Subclipse]] == Settings == * To avoid unresolved things, add them either in environment (see [[Libraries]], CPATH) or in Eclipse (Project -> Properties -> C/C++ General -> Paths and Symbols) 30d08bcac3329cf8aa54b2d627dbfe271869b343 113 112 2010-01-26T19:09:16Z Root 1 wikitext text/x-wiki http://www.eclipse.org * Install Sun Java 6 
(Debian package: sun-java6-jre) * It is recommended to use the latest version from http://www.eclipse.org/downloads/ Eclipse CDT - a distribution for C/C++ development * How to install plugins: http://agile.csc.ncsu.edu/SEMaterials/tutorials/install_plugin/ http://www.venukb.com/2006/08/20/install-eclipse-plugins-the-easy-way/ == Plugins == * [[Linuxtools]] * [[Eclox]] * [[Subversive]] or [[Subclipse]] == Settings == * To avoid unresolved things, add them to Eclipse Project -> Properties -> C/C++ General -> Paths and Symbols 002ed9783d6f50c107b9b0807ef645ca935b0eed 114 113 2010-01-26T19:11:04Z Root 1 wikitext text/x-wiki http://www.eclipse.org * Install Sun Java 6 (Debian package: sun-java6-jre) * It is recommended to use the latest version from http://www.eclipse.org/downloads/ Eclipse CDT - a distribution for C/C++ development * How to install plugins: http://agile.csc.ncsu.edu/SEMaterials/tutorials/install_plugin/ http://www.venukb.com/2006/08/20/install-eclipse-plugins-the-easy-way/ == Plugins == * [[Linuxtools]] * [[Eclox]] * [[Subversive]] or [[Subclipse]] == Settings == * To avoid unresolved things, add them to either environment (see [[Libraries]]) or Eclipse Project -> Properties -> C/C++ General -> Paths and Symbols a47a8938d84123deacce63c50630c9f7db5dcb27 115 114 2010-01-26T19:11:19Z Root 1 wikitext text/x-wiki http://www.eclipse.org * Install Sun Java 6 (Debian package: sun-java6-jre) * It is recommended to use the latest version from http://www.eclipse.org/downloads/ Eclipse CDT - a distribution for C/C++ development * How to install plugins: http://agile.csc.ncsu.edu/SEMaterials/tutorials/install_plugin/ http://www.venukb.com/2006/08/20/install-eclipse-plugins-the-easy-way/ == Plugins == * [[Linuxtools]] * [[Eclox]] * [[Subversive]] or [[Subclipse]] == Settings == * To avoid unresolved things, add them to either environment (see [[Libraries]]) or Eclipse (Project -> Properties -> C/C++ General -> Paths and Symbols) 
a9ac112cec01ee00fc54041b60222e0b40b0650d 116 115 2010-01-26T19:12:42Z Root 1 wikitext text/x-wiki http://www.eclipse.org * Install Sun Java 6 (Debian package: sun-java6-jre) * It is recommended to use the latest version from http://www.eclipse.org/downloads/ Eclipse CDT - a distribution for C/C++ development * How to install plugins: http://agile.csc.ncsu.edu/SEMaterials/tutorials/install_plugin/ http://www.venukb.com/2006/08/20/install-eclipse-plugins-the-easy-way/ == Plugins == * [[Linuxtools]] * [[Eclox]] * [[Subversive]] or [[Subclipse]] == Settings == * To avoid unresolved things, add them to Project -> Properties -> C/C++ General -> Paths and Symbols 1a745ef3edb47ff12f5fefe127d6d0ff0c8afcdf 117 116 2010-01-26T19:16:14Z Root 1 wikitext text/x-wiki http://www.eclipse.org * Install Sun Java 6 (Debian package: sun-java6-jre) * It is recommended to use the latest version from http://www.eclipse.org/downloads/ Eclipse CDT - a distribution for C/C++ development * How to install plugins: http://agile.csc.ncsu.edu/SEMaterials/tutorials/install_plugin/ http://www.venukb.com/2006/08/20/install-eclipse-plugins-the-easy-way/ == Plugins == * [[Linuxtools]] * [[Eclox]] * [[Subversive]] or [[Subclipse]] == Settings == * To avoid unresolved inclusions at edit time, add paths to Project -> Properties -> C/C++ General -> Paths and Symbols -> Includes 0fec0edbb83bcba941e236977023c0af43a653bf LaTeX 0 20 118 43 2010-01-28T18:35:56Z Root 1 wikitext text/x-wiki * Beamer - a package for presentation slides * Latex can be used in the [[Doxygen]] documentation to include formulas, bibliographic references, etc. 3ea73e8f69fbc4acdc594e652bda18b5339913f5 R 0 24 126 48 2010-01-29T13:06:13Z Root 1 wikitext text/x-wiki http://cran.r-project.org/ Installation from source: 1. R should be configured as a shared library <source lang="bash"> $ ./configure --prefix=DIR --enable-R-shlib=yes $ make install </source> 2. Set up environment <source lang="bash"> $ export R_HOME=DIR/lib/R </source> 3. 
Install required packages <source lang="bash"> $ DIR/bin/R > install.packages(c("sandwich", "strucchange", "zoo")) </source> 4. If R is installed in a non-default directory <source lang="bash"> $ export LD_LIBRARY_PATH=$R_HOME/lib:$LD_LIBRARY_PATH </source> f56d8694cea4f26972b587d74d361a1c9a859cce GSL 0 25 127 49 2010-01-29T13:07:03Z Root 1 wikitext text/x-wiki http://www.gnu.org/software/gsl/ If GSL is installed in a non-default directory <source lang="bash"> $ export LD_LIBRARY_PATH=DIR/lib:$LD_LIBRARY_PATH </source> 2362b8ca7694f9058a3c48913a88653bf776f483 128 127 2010-01-29T13:07:21Z Root 1 wikitext text/x-wiki http://www.gnu.org/software/gsl/ If GSL is installed in a non-default directory <source lang="bash"> $ export LD_LIBRARY_PATH=DIR/lib:$LD_LIBRARY_PATH </source> 768b38d7b6858583a65c52e6dffb059eb517b028 Boost 0 26 129 50 2010-01-29T13:09:36Z Root 1 wikitext text/x-wiki http://www.boost.org/ Installation from source: 1. Boost should be configured with at least the graph and serialization libraries (default: all) <source lang="bash"> $ ./configure --prefix=DIR --with-libraries=graph,serialization </source> 2. 
Default installation: - DIR/include/boost_version/boost - DIR/lib/libboost_library_versions.* Create symbolic links: <source lang="bash"> $ cd DIR/include; ln -s boost_version/boost $ cd DIR/lib; ln -s libboost_[library]_[version].[a/so] libboost_[library].[a/so] $ export LD_LIBRARY_PATH=DIR/lib:$LD_LIBRARY_PATH </source> 4bd459e8e47544890aac26beb9011d46ee852957 Graphviz 0 23 132 47 2010-01-29T13:13:40Z Root 1 wikitext text/x-wiki * [http://en.wikipedia.org/wiki/DOT_language DOT language] * Can be used with [[Doxygen]] for class diagrams * Can be used with the [[Boost]] Graph for graph visualization 4bbd01c4a57a63828f1be52f3b542aeb1c7c74e9 133 132 2010-01-29T13:18:13Z Root 1 wikitext text/x-wiki http://graphviz.org [http://en.wikipedia.org/wiki/DOT_language DOT language] * Can be used with [[Doxygen]] for class diagrams * Can be used with the [[Boost]] Graph for graph visualization 4736b9d14e981adf0f76675481a4d22e775d1fd2 Subversive 0 12 134 24 2010-01-29T13:40:37Z Root 1 wikitext text/x-wiki [[Subversion]] for [[Eclipse]] http://www.eclipse.org/subversive/ Eclipse install http://download.eclipse.org/technology/subversive/0.7/update-site/ + select a client at the first run To work with your code from both command line and Eclipse, the versions of [[Subversion]] and Subversive Connector must be the same. a75cedfb7ad4ad45344acc4144084e66128f847b 135 134 2010-01-29T13:41:01Z Root 1 wikitext text/x-wiki [[Subversion]] for [[Eclipse]] http://www.eclipse.org/subversive/ Eclipse install http://download.eclipse.org/technology/subversive/0.7/update-site/ + select a client at the first run To work with your code from both command line and [[Eclipse]], the versions of [[Subversion]] and Subversive Connector must be the same. 
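The DOT language mentioned on the [[Graphviz]] page, and the suggestion to use it to visualize trees, can be illustrated with a minimal file (node names are arbitrary); render it with <code>dot -Tpng tree.dot -o tree.png</code>:

```dot
digraph tree {
    root -> a;
    root -> b;
    a -> a1;
    a -> a2;
}
```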
79032515aa8daa611ab9bda0ef860b28369991a9 C/C++ 0 14 136 120 2010-01-29T17:09:50Z Root 1 wikitext text/x-wiki == Coding == * [http://en.wikipedia.org/wiki/Indent_style#Variant:_1TBS One-true-brace ident style] * [http://en.wikipedia.org/wiki/Pragma_once Coding header files] == Commenting == * Place [[Doxygen]] comments in header files (before declarations of namespaces/classes/structs/typedefs/macros) and main source files (for documenting tools and tests) * Use double forward slash for short comments in the code == C++ == * [http://developers.sun.com/solaris/articles/mixing.html Mixing C/C++] * [http://en.wikipedia.org/wiki/Template_metaprogramming Template C++] == General == * Don't use non-standard functions, like [http://en.wikipedia.org/wiki/Itoa itoa] * [http://en.wikipedia.org/wiki/Dynamic_loading Dynamic loading of shared libraries] 006a64c333508635ee62fb3344a022352ae33132 138 136 2010-02-04T17:50:41Z Root 1 /* Coding */ wikitext text/x-wiki == Coding == * [http://en.wikipedia.org/wiki/Pragma_once Coding header files] * [http://en.wikipedia.org/wiki/Indent_style#Variant:_1TBS One-true-brace ident style] * Learn from examples and use techniques from third-party software == Commenting == * Place [[Doxygen]] comments in header files (before declarations of namespaces/classes/structs/typedefs/macros) and main source files (for documenting tools and tests) * Use double forward slash for short comments in the code == C++ == * [http://developers.sun.com/solaris/articles/mixing.html Mixing C/C++] * [http://en.wikipedia.org/wiki/Template_metaprogramming Template C++] == General == * Don't use non-standard functions, like [http://en.wikipedia.org/wiki/Itoa itoa] * [http://en.wikipedia.org/wiki/Dynamic_loading Dynamic loading of shared libraries] 4d8b21dac4ec65679e5acd84c39894f4afa68ab4 139 138 2010-02-04T17:51:28Z Root 1 /* Coding */ wikitext text/x-wiki == Coding == * [http://en.wikipedia.org/wiki/Pragma_once Coding header files] * 
[http://en.wikipedia.org/wiki/Indent_style#Variant:_1TBS One-true-brace ident style] * Learn from examples and use coding approaches from third-party software == Commenting == * Place [[Doxygen]] comments in header files (before declarations of namespaces/classes/structs/typedefs/macros) and main source files (for documenting tools and tests) * Use double forward slash for short comments in the code == C++ == * [http://developers.sun.com/solaris/articles/mixing.html Mixing C/C++] * [http://en.wikipedia.org/wiki/Template_metaprogramming Template C++] == General == * Don't use non-standard functions, like [http://en.wikipedia.org/wiki/Itoa itoa] * [http://en.wikipedia.org/wiki/Dynamic_loading Dynamic loading of shared libraries] 67ed171f84c4c069c0b7d422c4373a53782c2c4a 142 139 2010-02-05T18:54:25Z Root 1 wikitext text/x-wiki == Coding == * [http://en.wikipedia.org/wiki/Pragma_once Coding header files] * [http://en.wikipedia.org/wiki/Indent_style#Variant:_1TBS One-true-brace ident style] * Learn from examples and use coding approaches from third-party software == Commenting == * Place [[Doxygen]] comments in header files (before declarations of namespaces/classes/structs/typedefs/macros) and main source files (for documenting tools and tests) * Use double forward slash for short comments in the code == C++ == * [http://developers.sun.com/solaris/articles/mixing.html Mixing C/C++] * [http://en.wikipedia.org/wiki/Template_metaprogramming Template C++] == General == * Don't use non-standard functions, like [http://en.wikipedia.org/wiki/Itoa itoa] * [http://en.wikipedia.org/wiki/Dynamic_loading Dynamic loading of shared libraries] == Creating libraries with [[Autotools]] == * Prepare two sets of headers: ** includes <code>include_HEADERS = ...</code> ** sources (internal C data structures and C++ template classes) <code>library_SOURCES = library.h ...</code> *** <code>library.h</code> is a precompiled header (including common headers), which is included in most of source 
files of the library 62e147be29e11dee4910e3d054338d5420456619 143 142 2010-02-05T18:54:43Z Root 1 wikitext text/x-wiki == Coding == * [http://en.wikipedia.org/wiki/Pragma_once Coding header files] * [http://en.wikipedia.org/wiki/Indent_style#Variant:_1TBS One-true-brace ident style] * Learn from examples and use coding approaches from third-party software == Commenting == * Place [[Doxygen]] comments in header files (before declarations of namespaces/classes/structs/typedefs/macros) and main source files (for documenting tools and tests) * Use double forward slash for short comments in the code == C++ == * [http://developers.sun.com/solaris/articles/mixing.html Mixing C/C++] * [http://en.wikipedia.org/wiki/Template_metaprogramming Template C++] == General == * Don't use non-standard functions, like [http://en.wikipedia.org/wiki/Itoa itoa] * [http://en.wikipedia.org/wiki/Dynamic_loading Dynamic loading of shared libraries] == Creating libraries with [[Autotools]] == * Prepare two sets of headers: ** includes <code>include_HEADERS = ...</code> ** sources (internal C data structures and C++ template classes) <code>library_SOURCES = library.h ...</code> *** <code>library.h</code> is a precompiled header (contains common headers), which is included in most of source files of the library 383db7549609788caad5a4d0119bed7b8a2f90e1 144 143 2010-02-05T18:54:56Z Root 1 wikitext text/x-wiki == Coding == * [http://en.wikipedia.org/wiki/Pragma_once Coding header files] * [http://en.wikipedia.org/wiki/Indent_style#Variant:_1TBS One-true-brace ident style] * Learn from examples and use coding approaches from third-party software == Commenting == * Place [[Doxygen]] comments in header files (before declarations of namespaces/classes/structs/typedefs/macros) and main source files (for documenting tools and tests) * Use double forward slash for short comments in the code == C++ == * [http://developers.sun.com/solaris/articles/mixing.html Mixing C/C++] * 
[http://en.wikipedia.org/wiki/Template_metaprogramming Template C++] == General == * Don't use non-standard functions, like [http://en.wikipedia.org/wiki/Itoa itoa] * [http://en.wikipedia.org/wiki/Dynamic_loading Dynamic loading of shared libraries] == Creating libraries with [[Autotools]] == * Prepare two sets of headers: ** includes <code>include_HEADERS = ...</code> ** sources (internal C data structures and C++ template classes) <code>library_SOURCES = library.h ...</code> *** <code>library.h</code> is a precompiled header (contains common headers and symbols), which is included in most of source files of the library 500e604e7f65b8f05a8ab6117adc655ece061cb6 145 144 2010-02-05T19:24:31Z Root 1 wikitext text/x-wiki == Coding == * [http://en.wikipedia.org/wiki/Pragma_once Coding header files] * [http://en.wikipedia.org/wiki/Indent_style#Variant:_1TBS One-true-brace ident style] * Learn from examples and use coding approaches from third-party software == Commenting == * Place [[Doxygen]] comments in header files (before declarations of namespaces/classes/structs/typedefs/macros) and main source files (for documenting tools and tests) * Use double forward slash for short comments in the code == C++ == * [http://developers.sun.com/solaris/articles/mixing.html Mixing C/C++] * [http://en.wikipedia.org/wiki/Template_metaprogramming Template C++] == General == * Don't use non-standard functions, like [http://en.wikipedia.org/wiki/Itoa itoa] * [http://en.wikipedia.org/wiki/Dynamic_loading Dynamic loading of shared libraries] == Creating libraries with [[Autotools]] == * Prepare two sets of headers: ** includes (for the include directory): <code>include_HEADERS = ...</code> ** sources (internal C data structures and C++ template classes): <code>library_SOURCES = library.h ...</code> *** <code>library.h</code> is a precompiled header (contains common headers and symbols), which is included in most of source files of the library ee2ce4849da3d41119974d7124652173285005d3 146 
145 2010-02-05T19:34:08Z Root 1 wikitext text/x-wiki == Coding == * [http://en.wikipedia.org/wiki/Pragma_once Coding header files] * [http://en.wikipedia.org/wiki/Indent_style#Variant:_1TBS One-true-brace ident style] * Learn from examples and use coding approaches from third-party software == Commenting == * Place [[Doxygen]] comments in header files (before declarations of namespaces/classes/structs/typedefs/macros) and main source files (for documenting tools and tests) * Use double forward slash for short comments in the code == C++ == * [http://developers.sun.com/solaris/articles/mixing.html Mixing C/C++] * [http://en.wikipedia.org/wiki/Template_metaprogramming Template C++] == General == * Don't use non-standard functions, like [http://en.wikipedia.org/wiki/Itoa itoa] * [http://en.wikipedia.org/wiki/Dynamic_loading Dynamic loading of shared libraries] == Creating libraries with [[Autotools]] == * Prepare two sets of headers: ** includes (for the include directory): <code>include_HEADERS = ...</code> ** sources (internal C data structures and C++ template classes): <code>library_SOURCES = library.h ...</code> *** <code>library.h</code> is a [http://en.wikipedia.org/wiki/Precompiled_header precompiled header] (contains common headers and symbols), which is to be included in most of source files of the library 40fe872a1fab9cf98ef798677baf80cfa927e068 149 146 2010-02-10T14:08:33Z Root 1 wikitext text/x-wiki == Coding == * [http://en.wikipedia.org/wiki/Pragma_once Coding header files] * [http://en.wikipedia.org/wiki/Indent_style#Variant:_1TBS One-true-brace ident style] * Learn from examples and use coding approaches from third-party software == Commenting == * Place [[Doxygen]] comments in header files (before declarations of namespaces/classes/structs/typedefs/macros) and main source files (for documenting tools and tests) * Use double forward slash for short comments in the code == C++ == * [http://developers.sun.com/solaris/articles/mixing.html Mixing C/C++] * 
[http://en.wikipedia.org/wiki/Template_metaprogramming Template C++] is preferrable from the point of view of runtime performance * Use plain C unless you need flexible data structures or STL/Boost functionality * Provide main API in C == General == * Don't use non-standard functions, like [http://en.wikipedia.org/wiki/Itoa itoa] * [http://en.wikipedia.org/wiki/Dynamic_loading Dynamic loading of shared libraries] == Creating libraries with [[Autotools]] == * Prepare two sets of headers: ** includes (for the include directory): <code>include_HEADERS = ...</code> ** sources (internal C data structures and C++ template classes): <code>library_SOURCES = library.h ...</code> *** <code>library.h</code> is a [http://en.wikipedia.org/wiki/Precompiled_header precompiled header] (contains common headers and symbols), which is to be included in most of source files of the library 0c7ece45004fc99f8c4bc5c5de215d42eb586e05 151 149 2010-02-10T14:09:26Z Root 1 wikitext text/x-wiki == Coding == * [http://en.wikipedia.org/wiki/Pragma_once Coding header files] * [http://en.wikipedia.org/wiki/Indent_style#Variant:_1TBS One-true-brace ident style] * Learn from examples and use coding approaches from third-party software == Commenting == * Place [[Doxygen]] comments in header files (before declarations of namespaces/classes/structs/typedefs/macros) and main source files (for documenting tools and tests) * Use double forward slash for short comments in the code == C++ == * [http://developers.sun.com/solaris/articles/mixing.html Mixing C/C++] * [http://en.wikipedia.org/wiki/Template_metaprogramming Template C++] is preferrable from the point of view of runtime performance * Use plain C unless you need flexible data structures or [[STL]]/[[Boost]] functionality * Provide main API in C == General == * Don't use non-standard functions, like [http://en.wikipedia.org/wiki/Itoa itoa] * [http://en.wikipedia.org/wiki/Dynamic_loading Dynamic loading of shared libraries] == Creating libraries with 
[[Autotools]] == * Prepare two sets of headers: ** includes (for the include directory): <code>include_HEADERS = ...</code> ** sources (internal C data structures and C++ template classes): <code>library_SOURCES = library.h ...</code> *** <code>library.h</code> is a [http://en.wikipedia.org/wiki/Precompiled_header precompiled header] (contains common headers and symbols), which is to be included in most of source files of the library da27a6177b7ca71f332ed09331c2c2ad40fc4ccf 154 151 2010-02-12T12:35:52Z Root 1 wikitext text/x-wiki == Coding == * [http://en.wikipedia.org/wiki/Pragma_once Coding header files] * [http://en.wikipedia.org/wiki/Indent_style#Variant:_1TBS One-true-brace ident style] * Learn from examples and use coding approaches from third-party software == Commenting == * Place [[Doxygen]] comments in header files (before declarations of namespaces/classes/structs/typedefs/macros) and main source files (for documenting tools and tests) * Use double forward slash for short comments in the code == C++ == * [http://developers.sun.com/solaris/articles/mixing.html Mixing C/C++] * [http://en.wikipedia.org/wiki/Template_metaprogramming Template C++] is preferrable from the point of view of runtime performance * Use plain C unless you need flexible data structures or [[STL]]/[[Boost]] functionality * Provide main API in C == General == * Don't use non-standard functions, like [http://en.wikipedia.org/wiki/Itoa itoa] * [http://www.gnu.org/software/libc/manual/html_node/Program-Arguments.html Handling program arguments] * [http://en.wikipedia.org/wiki/Dynamic_loading Dynamic loading of shared libraries] == Creating libraries with [[Autotools]] == * Prepare two sets of headers: ** includes (for the include directory): <code>include_HEADERS = ...</code> ** sources (internal C data structures and C++ template classes): <code>library_SOURCES = library.h ...</code> *** <code>library.h</code> is a [http://en.wikipedia.org/wiki/Precompiled_header precompiled header] 
(contains common headers and symbols), which is to be included in most of source files of the library 78dc26d2102128b0e3864300ffeac1e60aa5ecb9 170 154 2010-02-16T16:01:35Z Root 1 /* Creating libraries with Autotools */ wikitext text/x-wiki == Coding == * [http://en.wikipedia.org/wiki/Pragma_once Coding header files] * [http://en.wikipedia.org/wiki/Indent_style#Variant:_1TBS One-true-brace ident style] * Learn from examples and use coding approaches from third-party software == Commenting == * Place [[Doxygen]] comments in header files (before declarations of namespaces/classes/structs/typedefs/macros) and main source files (for documenting tools and tests) * Use double forward slash for short comments in the code == C++ == * [http://developers.sun.com/solaris/articles/mixing.html Mixing C/C++] * [http://en.wikipedia.org/wiki/Template_metaprogramming Template C++] is preferrable from the point of view of runtime performance * Use plain C unless you need flexible data structures or [[STL]]/[[Boost]] functionality * Provide main API in C == General == * Don't use non-standard functions, like [http://en.wikipedia.org/wiki/Itoa itoa] * [http://www.gnu.org/software/libc/manual/html_node/Program-Arguments.html Handling program arguments] * [http://en.wikipedia.org/wiki/Dynamic_loading Dynamic loading of shared libraries] == Creating libraries with [[Autotools]] == * Prepare two sets of headers: ** includes (for the include directory): <code>include_HEADERS = ...</code> ** define static <code>lib_LIBRARIES = library.a</code> or dynamic <code>lib_LTLIBRARIES = library.la</code> ** sources (internal C data structures and C++ template classes): <code>library_X_SOURCES = library.h ...</code>, where <code>X</code> = <code>a</code> or <code>la</code> *** <code>library.h</code> is a [http://en.wikipedia.org/wiki/Precompiled_header precompiled header] (contains common headers and symbols), which is to be included in most of source files of the library 
ee23ff1a871beebf1a10caedd1edeabd9db76ef5 171 170 2010-02-16T16:05:57Z Root 1 /* Creating libraries with Autotools */ wikitext text/x-wiki == Coding == * [http://en.wikipedia.org/wiki/Pragma_once Coding header files] * [http://en.wikipedia.org/wiki/Indent_style#Variant:_1TBS One-true-brace ident style] * Learn from examples and use coding approaches from third-party software == Commenting == * Place [[Doxygen]] comments in header files (before declarations of namespaces/classes/structs/typedefs/macros) and main source files (for documenting tools and tests) * Use double forward slash for short comments in the code == C++ == * [http://developers.sun.com/solaris/articles/mixing.html Mixing C/C++] * [http://en.wikipedia.org/wiki/Template_metaprogramming Template C++] is preferrable from the point of view of runtime performance * Use plain C unless you need flexible data structures or [[STL]]/[[Boost]] functionality * Provide main API in C == General == * Don't use non-standard functions, like [http://en.wikipedia.org/wiki/Itoa itoa] * [http://www.gnu.org/software/libc/manual/html_node/Program-Arguments.html Handling program arguments] * [http://en.wikipedia.org/wiki/Dynamic_loading Dynamic loading of shared libraries] == Creating libraries with [[Autotools]] == * includes (for the include directory): <code>include_HEADERS = ...</code> * library: static <code>lib_LIBRARIES = library.a</code> or dynamic <code>lib_LTLIBRARIES = library.la</code> * sources (internal C data structures and C++ template classes): <code>library_X_SOURCES = library.h ...</code>, where <code>X</code> = <code>a</code> or <code>la</code> ** <code>library.h</code> is a [http://en.wikipedia.org/wiki/Precompiled_header precompiled header] (contains common headers and symbols), which is to be included in most of source files of the library For example, see: * http://hcl.ucd.ie/repos/CPM/trunk/MPIBlib/benchmarks/Makefile.am * http://hcl.ucd.ie/repos/CPM/trunk/MPIBlib/collectives/Makefile.am 
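Following the conventions above, a <code>Makefile.am</code> for a hypothetical libtool library might look like this sketch (the names are placeholders, not taken from the MPIBlib examples linked above):

```make
# Public headers, installed into the include directory
include_HEADERS = mylib.h

# A dynamic library via libtool; for a static one use: lib_LIBRARIES = libmylib.a
lib_LTLIBRARIES = libmylib.la

# Internal sources; library.h is the precompiled header
# included by most source files of the library
libmylib_la_SOURCES = library.h mylib.c mylib_internal.h
```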
a58d390f71dbfd088b36884ba30d4564243d7262 173 171 2010-02-16T16:09:05Z Root 1 wikitext text/x-wiki == Coding == * [http://en.wikipedia.org/wiki/Pragma_once Coding header files] * [http://en.wikipedia.org/wiki/Indent_style#Variant:_1TBS One-true-brace indent style] * Learn from examples and use coding approaches from third-party software == Commenting == * Place [[Doxygen]] comments in header files (before declarations of namespaces/classes/structs/typedefs/macros) and main source files (for documenting tools and tests) * Use double forward slash for short comments in the code == C++ == * [http://developers.sun.com/solaris/articles/mixing.html Mixing C/C++] * [http://en.wikipedia.org/wiki/Template_metaprogramming Template C++] is preferable for runtime performance * Use plain C unless you need flexible data structures or [[STL]]/[[Boost]] functionality * Provide the main API in C == General == * Don't use non-standard functions, like [http://en.wikipedia.org/wiki/Itoa itoa] * [http://www.gnu.org/software/libc/manual/html_node/Program-Arguments.html Handling program arguments] * [http://en.wikipedia.org/wiki/Dynamic_loading Dynamic loading of shared libraries] 794e2f0f921514659171a1f4e2afea7304cd6439 MPI 0 29 137 90 2010-02-01T14:50:47Z Root 1 wikitext text/x-wiki You can choose among different MPI implementations: * Open MPI: [http://www.open-mpi.org] == Debugging == * Add the following code: <source lang="C"> int rank; MPI_Comm_rank(MPI_COMM_WORLD, &rank); if (!rank) getc(stdin); MPI_Barrier(MPI_COMM_WORLD); </source> * Compile your code with the <code>-g</code> option * Run the parallel application * Attach to process(es) from [[GDB]] ** MPICH-1 runs a background process for each application process: 0, 0b, 1, 1b, ..., therefore, attach to the first ones.
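The <code>getc()</code> trick above only pauses rank 0. An alternative sketch (not from this wiki: the <code>wait_for_debugger</code> name and <code>WAIT_GDB</code> variable are invented) is to spin on a flag that the attached debugger clears with <code>set var go = 1</code>; call it in every rank right after <code>MPI_Init</code>:

<source lang="C">
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

// Hypothetical helper: spin until a debugger attaches and clears the flag.
// Only waits when the WAIT_GDB environment variable is set, so normal
// runs are unaffected.
void wait_for_debugger(void)
{
    volatile int go = getenv("WAIT_GDB") ? 0 : 1;
    fprintf(stderr, "pid %d: waiting for debugger = %d\n", (int)getpid(), !go);
    while (!go)
        sleep(1);  // attach with: gdb -p <pid>, then: set var go = 1
}
</source>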
e436d0341ed5265e5f534da0e77df2c4e6d24386 Eclipse 0 9 140 117 2010-02-05T17:15:55Z Root 1 wikitext text/x-wiki http://www.eclipse.org * Install Sun Java 6 (Debian package: sun-java6-jre) * It is recommended to use the latest version from http://www.eclipse.org/downloads/ Eclipse CDT - a distribution for C/C++ development * How to install plugins: http://agile.csc.ncsu.edu/SEMaterials/tutorials/install_plugin/ http://www.venukb.com/2006/08/20/install-eclipse-plugins-the-easy-way/ == Plugins == * [[Linuxtools]] * [[Eclox]] * [[Subversive]] or [[Subclipse]] == Usage == * To avoid unresolved inclusions at edit time, add paths to Project -> Properties -> C/C++ General -> Paths and Symbols -> Includes * The comments <source lang="C">// TODO: ...</source> mark what you are going to do later. These parts of code can easily be found if you open the Tasks view (Window -> Show View -> Tasks) 42cca63f7e324099b162b3d2d725009ddfa07637 141 140 2010-02-05T17:16:13Z Root 1 wikitext text/x-wiki http://www.eclipse.org * Install Sun Java 6 (Debian package: sun-java6-jre) * It is recommended to use the latest version from http://www.eclipse.org/downloads/ Eclipse CDT - a distribution for C/C++ development * How to install plugins: http://agile.csc.ncsu.edu/SEMaterials/tutorials/install_plugin/ http://www.venukb.com/2006/08/20/install-eclipse-plugins-the-easy-way/ == Plugins == * [[Linuxtools]] * [[Eclox]] * [[Subversive]] or [[Subclipse]] == Usage == * To avoid unresolved inclusions at edit time, add paths to Project -> Properties -> C/C++ General -> Paths and Symbols -> Includes * The comments <code>// TODO: ...</code> mark what you are going to do later. These parts of code can easily be found if you open the Tasks view (Window -> Show View -> Tasks) ee890b59e36b10edc5258ad7e76cf333bf09dcaf Main Page 0 1 147 131 2010-02-08T17:00:58Z Root 1 wikitext text/x-wiki This site is set up for sharing ideas, findings and experience in heterogeneous computing. 
Please, log in and create new or edit existing pages. How to format wiki-pages read [[Help:Editing|here]]. == Operating systems == * [[Linux]] * [[Windows]] == Development tools == * [[C/C++]] * [[Autotools]] * [[GDB]] * [[OProfile]], [[Valgrind]] * [[ChangeLog]] * [[Doxygen]] * [[Subversion]] * [[Eclipse]] == [[Libraries]] == * [[MPI]] * [[Boost]] * [[GSL]] * [[BLAS/LAPACK]] * [[The MPI LogP Benchmark|logp_mpi]] == Data processing == * [[gnuplot]] * [[Graphviz]] * [[R]] * [[Octave]] == Presentation == * [[Dia]] * [[Latex]] == Clusters == * [[HCL cluster]] * [[UTK clusters]] * [[Grid5000]] == Some Maths == * [http://en.wikipedia.org/wiki/Confidence_interval Confidence interval (Statistics)]. [http://en.wikipedia.org/wiki/Student's_t-distribution Student's t-distribution], which is used to find confidence interval, is implemented in [[GSL]] (<code>gsl_cdf_tdist_Pinv</code>) * [http://en.wikipedia.org/wiki/Binomial_tree#Binomial_tree Binomial tree]. Use [[Graphviz]] to visualize trees 502fff18a232c61ab1c6ba8dcf4c7ab0644d61d9 150 147 2010-02-10T14:09:06Z Root 1 /* Libraries */ wikitext text/x-wiki This site is set up for sharing ideas, findings and experience in heterogeneous computing. Please, log in and create new or edit existing pages. How to format wiki-pages read [[Help:Editing|here]]. == Operating systems == * [[Linux]] * [[Windows]] == Development tools == * [[C/C++]] * [[Autotools]] * [[GDB]] * [[OProfile]], [[Valgrind]] * [[ChangeLog]] * [[Doxygen]] * [[Subversion]] * [[Eclipse]] == [[Libraries]] == * [[MPI]] * [[STL]] * [[Boost]] * [[GSL]] * [[BLAS/LAPACK]] * [[The MPI LogP Benchmark|logp_mpi]] == Data processing == * [[gnuplot]] * [[Graphviz]] * [[R]] * [[Octave]] == Presentation == * [[Dia]] * [[Latex]] == Clusters == * [[HCL cluster]] * [[UTK clusters]] * [[Grid5000]] == Some Maths == * [http://en.wikipedia.org/wiki/Confidence_interval Confidence interval (Statistics)]. 
[http://en.wikipedia.org/wiki/Student's_t-distribution Student's t-distribution], which is used to find confidence interval, is implemented in [[GSL]] (<code>gsl_cdf_tdist_Pinv</code>) * [http://en.wikipedia.org/wiki/Binomial_tree#Binomial_tree Binomial tree]. Use [[Graphviz]] to visualize trees 3eedc342220ac466dee32fb840c7f6cfac5d2244 155 150 2010-02-15T13:10:55Z Root 1 /* Clusters */ wikitext text/x-wiki This site is set up for sharing ideas, findings and experience in heterogeneous computing. Please, log in and create new or edit existing pages. How to format wiki-pages read [[Help:Editing|here]]. == Operating systems == * [[Linux]] * [[Windows]] == Development tools == * [[C/C++]] * [[Autotools]] * [[GDB]] * [[OProfile]], [[Valgrind]] * [[ChangeLog]] * [[Doxygen]] * [[Subversion]] * [[Eclipse]] == [[Libraries]] == * [[MPI]] * [[STL]] * [[Boost]] * [[GSL]] * [[BLAS/LAPACK]] * [[The MPI LogP Benchmark|logp_mpi]] == Data processing == * [[gnuplot]] * [[Graphviz]] * [[R]] * [[Octave]] == Presentation == * [[Dia]] * [[Latex]] == [[Clusters]] == * [[HCL cluster]] * [[UTK clusters]] * [[Grid5000]] == Some Maths == * [http://en.wikipedia.org/wiki/Confidence_interval Confidence interval (Statistics)]. [http://en.wikipedia.org/wiki/Student's_t-distribution Student's t-distribution], which is used to find confidence interval, is implemented in [[GSL]] (<code>gsl_cdf_tdist_Pinv</code>) * [http://en.wikipedia.org/wiki/Binomial_tree#Binomial_tree Binomial tree]. Use [[Graphviz]] to visualize trees 1234e144a4fb7e7c3a1104e97e2d45d56864aa4f 159 155 2010-02-15T14:27:02Z Kiril 3 /* Clusters */ wikitext text/x-wiki This site is set up for sharing ideas, findings and experience in heterogeneous computing. Please, log in and create new or edit existing pages. How to format wiki-pages read [[Help:Editing|here]]. 
== Operating systems == * [[Linux]] * [[Windows]] == Development tools == * [[C/C++]] * [[Autotools]] * [[GDB]] * [[OProfile]], [[Valgrind]] * [[ChangeLog]] * [[Doxygen]] * [[Subversion]] * [[Eclipse]] == [[Libraries]] == * [[MPI]] * [[STL]] * [[Boost]] * [[GSL]] * [[BLAS/LAPACK]] * [[The MPI LogP Benchmark|logp_mpi]] == Data processing == * [[gnuplot]] * [[Graphviz]] * [[R]] * [[Octave]] == Presentation == * [[Dia]] * [[Latex]] == Clusters == * [[Passwordless SSH]] * [[HCL cluster]] * [[UTK clusters]] * [[Grid5000]] == Some Maths == * [http://en.wikipedia.org/wiki/Confidence_interval Confidence interval (Statistics)]. [http://en.wikipedia.org/wiki/Student's_t-distribution Student's t-distribution], which is used to find confidence interval, is implemented in [[GSL]] (<code>gsl_cdf_tdist_Pinv</code>) * [http://en.wikipedia.org/wiki/Binomial_tree#Binomial_tree Binomial tree]. Use [[Graphviz]] to visualize trees 013e67b45a41c38fc87f050e1fce0d9c1043b9d2 169 159 2010-02-15T17:49:03Z Kiril 3 /* Clusters */ wikitext text/x-wiki This site is set up for sharing ideas, findings and experience in heterogeneous computing. Please, log in and create new or edit existing pages. How to format wiki-pages read [[Help:Editing|here]]. == Operating systems == * [[Linux]] * [[Windows]] == Development tools == * [[C/C++]] * [[Autotools]] * [[GDB]] * [[OProfile]], [[Valgrind]] * [[ChangeLog]] * [[Doxygen]] * [[Subversion]] * [[Eclipse]] == [[Libraries]] == * [[MPI]] * [[STL]] * [[Boost]] * [[GSL]] * [[BLAS/LAPACK]] * [[The MPI LogP Benchmark|logp_mpi]] == Data processing == * [[gnuplot]] * [[Graphviz]] * [[R]] * [[Octave]] == Presentation == * [[Dia]] * [[Latex]] == Clusters == * [[SSH utilization]] * [[HCL cluster]] * [[UTK clusters]] * [[Grid5000]] == Some Maths == * [http://en.wikipedia.org/wiki/Confidence_interval Confidence interval (Statistics)]. 
[http://en.wikipedia.org/wiki/Student's_t-distribution Student's t-distribution], which is used to find confidence interval, is implemented in [[GSL]] (<code>gsl_cdf_tdist_Pinv</code>) * [http://en.wikipedia.org/wiki/Binomial_tree#Binomial_tree Binomial tree]. Use [[Graphviz]] to visualize trees a5a25adc96977c2fac38ecd9a3826930bf960894 Autotools 0 21 148 44 2010-02-09T17:42:33Z Root 1 wikitext text/x-wiki http://en.wikipedia.org/wiki/Autoconf http://sources.redhat.com/autobook/autobook/autobook.html Manuals * http://www.gnu.org/software/autoconf/manual/index.html * http://www.gnu.org/software/automake/manual/index.html * http://www.gnu.org/software/libtool/manual/index.html 4e76c31bbbf7019f55f79fff1e5ac3d542618658 172 148 2010-02-16T16:08:50Z Root 1 wikitext text/x-wiki http://en.wikipedia.org/wiki/Autoconf http://sources.redhat.com/autobook/autobook/autobook.html == Manuals == * http://www.gnu.org/software/autoconf/manual/index.html * http://www.gnu.org/software/automake/manual/index.html * http://www.gnu.org/software/libtool/manual/index.html == Libraries == * includes (for the include directory): <code>include_HEADERS = ...</code> * library: static <code>lib_LIBRARIES = library.a</code> or dynamic <code>lib_LTLIBRARIES = library.la</code> * sources (internal C data structures and C++ template classes): <code>library_X_SOURCES = library.h ...</code>, where <code>X</code> = <code>a</code> or <code>la</code> ** <code>library.h</code> is a [http://en.wikipedia.org/wiki/Precompiled_header precompiled header] (contains common headers and symbols), which is to be included in most of source files of the library For example, see: * http://hcl.ucd.ie/repos/CPM/trunk/MPIBlib/benchmarks/Makefile.am * http://hcl.ucd.ie/repos/CPM/trunk/MPIBlib/collectives/Makefile.am == Config headers == f63d757f12ba568074788288f410364baf4d210a 174 172 2010-02-16T16:16:00Z Root 1 /* Config headers */ wikitext text/x-wiki http://en.wikipedia.org/wiki/Autoconf 
http://sources.redhat.com/autobook/autobook/autobook.html == Manuals == * http://www.gnu.org/software/autoconf/manual/index.html * http://www.gnu.org/software/automake/manual/index.html * http://www.gnu.org/software/libtool/manual/index.html == Libraries == * includes (for the include directory): <code>include_HEADERS = ...</code> * library: static <code>lib_LIBRARIES = library.a</code> or dynamic <code>lib_LTLIBRARIES = library.la</code> * sources (internal C data structures and C++ template classes): <code>library_X_SOURCES = library.h ...</code>, where <code>X</code> = <code>a</code> or <code>la</code> ** <code>library.h</code> is a [http://en.wikipedia.org/wiki/Precompiled_header precompiled header] (contains common headers and symbols), which is to be included in most of source files of the library For example, see: * http://hcl.ucd.ie/repos/CPM/trunk/MPIBlib/benchmarks/Makefile.am * http://hcl.ucd.ie/repos/CPM/trunk/MPIBlib/collectives/Makefile.am == Configured headers == Configured headers (created from *.h.in) must not be included into the package, that is <code>include_HEADERS</code> or <code>*_SOURCES</code> * <code>nodist_include_HEADERS = *.h</code> for the configured headers as includes * <code>nodist_*_SOURCES = *.h</code> for the configured headers as sources 63c428174ea6210e83bd8ebb2e9d18a2838b1701 175 174 2010-02-16T16:16:43Z Root 1 /* Configured headers */ wikitext text/x-wiki http://en.wikipedia.org/wiki/Autoconf http://sources.redhat.com/autobook/autobook/autobook.html == Manuals == * http://www.gnu.org/software/autoconf/manual/index.html * http://www.gnu.org/software/automake/manual/index.html * http://www.gnu.org/software/libtool/manual/index.html == Libraries == * includes (for the include directory): <code>include_HEADERS = ...</code> * library: static <code>lib_LIBRARIES = library.a</code> or dynamic <code>lib_LTLIBRARIES = library.la</code> * sources (internal C data structures and C++ template classes): <code>library_X_SOURCES = library.h 
...</code>, where <code>X</code> = <code>a</code> or <code>la</code> ** <code>library.h</code> is a [http://en.wikipedia.org/wiki/Precompiled_header precompiled header] (contains common headers and symbols), which is to be included in most of source files of the library For example, see: * http://hcl.ucd.ie/repos/CPM/trunk/MPIBlib/benchmarks/Makefile.am * http://hcl.ucd.ie/repos/CPM/trunk/MPIBlib/collectives/Makefile.am == Configured headers == Configured headers (created from *.h.in) must not be included into the package, that is <code>include_HEADERS</code> or <code>*_SOURCES</code> * <code>nodist_include_HEADERS = *.h</code> for the configured headers as includes * <code>nodist_*_SOURCES = *.h</code> for the configured headers as sources For example, http://hcl.ucd.ie/repos/CPM/trunk/MPIBlib/collectives/Makefile.am 66ccbaa9f115f35f004a65096ae8f68ec93390fd 176 175 2010-02-16T16:17:34Z Root 1 /* Libraries */ wikitext text/x-wiki http://en.wikipedia.org/wiki/Autoconf http://sources.redhat.com/autobook/autobook/autobook.html == Manuals == * http://www.gnu.org/software/autoconf/manual/index.html * http://www.gnu.org/software/automake/manual/index.html * http://www.gnu.org/software/libtool/manual/index.html == Libraries == * includes (for the include directory): <code>include_HEADERS = ...</code> * library: static <code>lib_LIBRARIES = library.a</code> or dynamic <code>lib_LTLIBRARIES = library.la</code> * sources (internal C data structures and C++ template classes): <code>library_X_SOURCES = library.h ...</code>, where <code>X</code> = <code>a</code> or <code>la</code> ** <code>library.h</code> is a [http://en.wikipedia.org/wiki/Precompiled_header precompiled header] (contains common headers and symbols), which is to be included in most of source files of the library For example, http://hcl.ucd.ie/repos/CPM/trunk/MPIBlib/benchmarks/Makefile.am == Configured headers == Configured headers (created from *.h.in) must not be included into the package, that is 
<code>include_HEADERS</code> or <code>*_SOURCES</code> * <code>nodist_include_HEADERS = *.h</code> for the configured headers as includes * <code>nodist_*_SOURCES = *.h</code> for the configured headers as sources For example, http://hcl.ucd.ie/repos/CPM/trunk/MPIBlib/collectives/Makefile.am d8dfaecffde8632f0f0d4733ba10b1143b87c87b 177 176 2010-02-16T16:19:13Z Root 1 wikitext text/x-wiki http://en.wikipedia.org/wiki/Autoconf http://sources.redhat.com/autobook/autobook/autobook.html == Manuals == * http://www.gnu.org/software/autoconf/manual/index.html * http://www.gnu.org/software/automake/manual/index.html * http://www.gnu.org/software/libtool/manual/index.html == Libraries == * includes (for the include directory): <code>include_HEADERS = ...</code> * library: static <code>lib_LIBRARIES = library.a</code> or dynamic <code>lib_LTLIBRARIES = library.la</code> * sources (internal C data structures and C++ template classes): <code>library_X_SOURCES = library.h ...</code>, where <code>X</code> = <code>a</code> or <code>la</code> ** <code>library.h</code> is a [http://en.wikipedia.org/wiki/Precompiled_header precompiled header] (contains common headers and symbols), which is to be included in most of source files of the library For example, http://hcl.ucd.ie/repos/CPM/trunk/MPIBlib/benchmarks/Makefile.am == Configured headers == Configured headers (created from *.h.in) must not be included into the package, that is <code>include_HEADERS</code> or <code>*_SOURCES</code> * <code>nodist_include_HEADERS = *.h</code> for the configured headers as includes * <code>nodist_*_SOURCES = *.h</code> for the configured headers as sources For example, http://hcl.ucd.ie/repos/CPM/trunk/MPIBlib/collectives/Makefile.am == Extra files == To add extra files into package, use <code>EXTRA_DIST = *</code>. 
For example, For example, http://hcl.ucd.ie/repos/CPM/trunk/MPIBlib/tools/Makefile.am 07994a4ef31efec886cb3172de92450805cdf3f1 178 177 2010-02-16T16:19:28Z Root 1 wikitext text/x-wiki http://en.wikipedia.org/wiki/Autoconf http://sources.redhat.com/autobook/autobook/autobook.html == Manuals == * http://www.gnu.org/software/autoconf/manual/index.html * http://www.gnu.org/software/automake/manual/index.html * http://www.gnu.org/software/libtool/manual/index.html == Libraries == * includes (for the include directory): <code>include_HEADERS = ...</code> * library: static <code>lib_LIBRARIES = library.a</code> or dynamic <code>lib_LTLIBRARIES = library.la</code> * sources (internal C data structures and C++ template classes): <code>library_X_SOURCES = library.h ...</code>, where <code>X</code> = <code>a</code> or <code>la</code> ** <code>library.h</code> is a [http://en.wikipedia.org/wiki/Precompiled_header precompiled header] (contains common headers and symbols), which is to be included in most of source files of the library For example, http://hcl.ucd.ie/repos/CPM/trunk/MPIBlib/benchmarks/Makefile.am == Configured headers == Configured headers (created from *.h.in) must not be included into the package, that is <code>include_HEADERS</code> or <code>*_SOURCES</code> * <code>nodist_include_HEADERS = *.h</code> for the configured headers as includes * <code>nodist_*_SOURCES = *.h</code> for the configured headers as sources For example, http://hcl.ucd.ie/repos/CPM/trunk/MPIBlib/collectives/Makefile.am == Extra files == To add extra files into package, use <code>EXTRA_DIST = *</code>. 
For example, http://hcl.ucd.ie/repos/CPM/trunk/MPIBlib/tools/Makefile.am 19c1e8e2a1f57e62814b672d671bba73afea9f25 179 178 2010-02-16T17:00:34Z Root 1 /* Configured headers */ wikitext text/x-wiki http://en.wikipedia.org/wiki/Autoconf http://sources.redhat.com/autobook/autobook/autobook.html == Manuals == * http://www.gnu.org/software/autoconf/manual/index.html * http://www.gnu.org/software/automake/manual/index.html * http://www.gnu.org/software/libtool/manual/index.html == Libraries == * includes (for the include directory): <code>include_HEADERS = ...</code> * library: static <code>lib_LIBRARIES = library.a</code> or dynamic <code>lib_LTLIBRARIES = library.la</code> * sources (internal C data structures and C++ template classes): <code>library_X_SOURCES = library.h ...</code>, where <code>X</code> = <code>a</code> or <code>la</code> ** <code>library.h</code> is a [http://en.wikipedia.org/wiki/Precompiled_header precompiled header] (contains common headers and symbols), which is to be included in most of source files of the library For example, http://hcl.ucd.ie/repos/CPM/trunk/MPIBlib/benchmarks/Makefile.am == Configured headers == Configured headers (created from *.h.in) must not be included into the package, that is <code>include_HEADERS</code> or <code>*_SOURCES</code> * <code>nodist_include_HEADERS = *.h</code> for the configured headers as includes * <code>nodist_*_SOURCES = *.h</code> or <code>BUILD_SOURCES = *.h</code>for the configured headers as sources For example, http://hcl.ucd.ie/repos/CPM/trunk/MPIBlib/collectives/Makefile.am == Extra files == To add extra files into package, use <code>EXTRA_DIST = *</code>. 
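Put together, a sketch of the configured-headers and extra-files rules in a <code>Makefile.am</code> (file names invented; the MPIBlib links are the real examples):

<source lang="make">
# config.h is generated by configure from config.h.in, so it must not be
# distributed: hence the nodist_ prefix (or list it in BUILT_SOURCES).
nodist_include_HEADERS     = config.h
nodist_libmylib_la_SOURCES = version.h
EXTRA_DIST                 = README scripts/run-tests.sh
</source>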
For example, http://hcl.ucd.ie/repos/CPM/trunk/MPIBlib/tools/Makefile.am 34431470d1e2c04d5e52ab3bcae25beb92c625eb 180 179 2010-02-16T17:00:43Z Root 1 /* Configured headers */ wikitext text/x-wiki http://en.wikipedia.org/wiki/Autoconf http://sources.redhat.com/autobook/autobook/autobook.html == Manuals == * http://www.gnu.org/software/autoconf/manual/index.html * http://www.gnu.org/software/automake/manual/index.html * http://www.gnu.org/software/libtool/manual/index.html == Libraries == * includes (for the include directory): <code>include_HEADERS = ...</code> * library: static <code>lib_LIBRARIES = library.a</code> or dynamic <code>lib_LTLIBRARIES = library.la</code> * sources (internal C data structures and C++ template classes): <code>library_X_SOURCES = library.h ...</code>, where <code>X</code> = <code>a</code> or <code>la</code> ** <code>library.h</code> is a [http://en.wikipedia.org/wiki/Precompiled_header precompiled header] (contains common headers and symbols), which is to be included in most of source files of the library For example, http://hcl.ucd.ie/repos/CPM/trunk/MPIBlib/benchmarks/Makefile.am == Configured headers == Configured headers (created from *.h.in) must not be included into the package, that is <code>include_HEADERS</code> or <code>*_SOURCES</code> * <code>nodist_include_HEADERS = *.h</code> for the configured headers as includes * <code>nodist_*_SOURCES = *.h</code> or <code>BUILD_SOURCES = *.h</code> for the configured headers as sources For example, http://hcl.ucd.ie/repos/CPM/trunk/MPIBlib/collectives/Makefile.am == Extra files == To add extra files into package, use <code>EXTRA_DIST = *</code>. 
For example, http://hcl.ucd.ie/repos/CPM/trunk/MPIBlib/tools/Makefile.am 6fcf5d1c963ac1d7addee49261236f0a64dddd03 181 180 2010-02-16T17:01:00Z Root 1 /* Configured headers */ wikitext text/x-wiki http://en.wikipedia.org/wiki/Autoconf http://sources.redhat.com/autobook/autobook/autobook.html == Manuals == * http://www.gnu.org/software/autoconf/manual/index.html * http://www.gnu.org/software/automake/manual/index.html * http://www.gnu.org/software/libtool/manual/index.html == Libraries == * includes (for the include directory): <code>include_HEADERS = ...</code> * library: static <code>lib_LIBRARIES = library.a</code> or dynamic <code>lib_LTLIBRARIES = library.la</code> * sources (internal C data structures and C++ template classes): <code>library_X_SOURCES = library.h ...</code>, where <code>X</code> = <code>a</code> or <code>la</code> ** <code>library.h</code> is a [http://en.wikipedia.org/wiki/Precompiled_header precompiled header] (contains common headers and symbols), which is to be included in most of source files of the library For example, http://hcl.ucd.ie/repos/CPM/trunk/MPIBlib/benchmarks/Makefile.am == Configured headers == Configured headers (created from *.h.in) must not be included into the package, that is <code>include_HEADERS</code> or <code>*_SOURCES</code> * <code>nodist_include_HEADERS = *.h</code> for the configured headers as includes * <code>nodist_*_SOURCES = *.h</code> or <code>BUILT_SOURCES = *.h</code> for the configured headers as sources For example, http://hcl.ucd.ie/repos/CPM/trunk/MPIBlib/collectives/Makefile.am == Extra files == To add extra files into package, use <code>EXTRA_DIST = *</code>. 
For example, http://hcl.ucd.ie/repos/CPM/trunk/MPIBlib/tools/Makefile.am 2b55e97dbe671e388ae0b45cd9da3450cbf4c333 STL 0 34 152 2010-02-10T14:12:24Z Root 1 New page: http://www.sgi.com/tech/stl/ http://en.wikipedia.org/wiki/Standard_Template_Library [http://softwareramblings.com/2009/06/update-stl-vs-gnulib-performance.html STL run-time performance] wikitext text/x-wiki http://www.sgi.com/tech/stl/ http://en.wikipedia.org/wiki/Standard_Template_Library [http://softwareramblings.com/2009/06/update-stl-vs-gnulib-performance.html STL run-time performance] 6d931a04c2d69690af185ec49419623bb0a5ed19 153 152 2010-02-10T14:13:45Z Root 1 wikitext text/x-wiki http://www.sgi.com/tech/stl/ http://en.wikipedia.org/wiki/Standard_Template_Library == Performance == * [http://www.tantalon.com/pete/gdc2001roundtablereport.htm Optimization Techniques] * [http://softwareramblings.com/2009/06/update-stl-vs-gnulib-performance.html Experimental results] d383ee2ab26dfa136aba4f09a4062e798cd66a4c Clusters 0 35 156 2010-02-15T13:11:24Z Root 1 New page: == Passwordless SSH == wikitext text/x-wiki == Passwordless SSH == a0a3c9a1b87841721cb4875b26c7b8d5c6cd9ecd 157 156 2010-02-15T14:23:43Z Kiril 3 /* Passwordless SSH */ wikitext text/x-wiki == Passwordless SSH == To set up passwordless SSH, there are three main things to do: * generate a pair of keys * copy a public key from the source computer to the target computer's authorized_keys file * check the permissions. See here for details: http://www.stearns.org/doc/ssh-techniques.current.html b64c6ff76b485884d7419ce3bd5efb5de651ae9b 158 157 2010-02-15T14:25:04Z Kiril 3 /* Passwordless SSH */ wikitext text/x-wiki == Passwordless SSH == To set up passwordless SSH, there are three main things to do: * generate a pair of public/private keys on your local computer * copy the public key from the source computer to the target computer's authorized_keys file * check the permissions. You can repeat that transitively for "A->B->C". 
You can use the initial pair of keys everywhere. See here for details: http://www.stearns.org/doc/ssh-techniques.current.html 30f9edca4ed29eb225b432d5921747901638278b SSH 0 36 160 2010-02-15T14:27:11Z Kiril 3 New page: To set up passwordless SSH, there are three main things to do: * generate a pair of public/private keys on your local computer * copy the public key from the source computer to th... wikitext text/x-wiki To set up passwordless SSH, there are three main things to do: * generate a pair of public/private keys on your local computer * copy the public key from the source computer to the target computer's authorized_keys file * check the permissions. You can repeat that transitively for "A->B->C". You can use the initial pair of keys everywhere. See here for details: http://www.stearns.org/doc/ssh-techniques.current.html 957a767c79201bb54857ee9d6c8b0f970cafe4b0 161 160 2010-02-15T17:36:08Z Kiril 3 wikitext text/x-wiki To set up passwordless SSH, there are three main things to do: * generate a pair of public/private keys on your local computer * copy the public key from the source computer to the target computer's authorized_keys file * check the permissions. You can repeat that transitively for "A->B->C". You can use the initial pair of keys everywhere. 
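On a typical OpenSSH setup the three steps above can be sketched as (<code>user@target</code> is a placeholder):

<source lang="bash">
ssh-keygen -t rsa        # 1. generate a key pair; accept the defaults
ssh-copy-id user@target  # 2. append the public key to target's authorized_keys
ssh user@target          # 3. should now log in without a password prompt
# If it still prompts, check permissions on the target:
#   chmod 700 ~/.ssh && chmod 600 ~/.ssh/authorized_keys
</source>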
See here for details: http://www.stearns.org/doc/ssh-techniques.current.html == Making life with SSH on HCL easier == Here is how you can set up the access to any machine as a Host csserver User kdichev Hostname csserver.ucd.ie Host heterogeneous User kiril Hostname heterogeneous.ucd.ie ProxyCommand ssh -qax csserver nc %h %p Host hcl01 Hostname hcl01.ucd.ie ProxyCommand ssh -qax heterogeneous nc %h %p b7870c153cfa7fdc8d0a0f898a2bb294c4a0ae4d 162 161 2010-02-15T17:38:10Z Kiril 3 /* Making life with SSH on HCL easier */ wikitext text/x-wiki To set up passwordless SSH, there are three main things to do: * generate a pair of public/private keys on your local computer * copy the public key from the source computer to the target computer's authorized_keys file * check the permissions. You can repeat that transitively for "A->B->C". You can use the initial pair of keys everywhere. See here for details: http://www.stearns.org/doc/ssh-techniques.current.html == Making life with SSH on HCL easier == Here is a very convenient way to set up the access to any machine directly instead of doing a cascade of SSH calls : Host csserver User kdichev Hostname csserver.ucd.ie Host heterogeneous User kiril Hostname heterogeneous.ucd.ie ProxyCommand ssh -qax csserver nc %h %p Host hcl01 Hostname hcl01.ucd.ie ProxyCommand ssh -qax heterogeneous nc %h %p 58c96b8dee22568b9909f33f800ac66ff28224d3 163 162 2010-02-15T17:38:54Z Kiril 3 /* Making life with SSH on HCL easier */ wikitext text/x-wiki To set up passwordless SSH, there are three main things to do: * generate a pair of public/private keys on your local computer * copy the public key from the source computer to the target computer's authorized_keys file * check the permissions. You can repeat that transitively for "A->B->C". You can use the initial pair of keys everywhere. 
See here for details: http://www.stearns.org/doc/ssh-techniques.current.html == Making life with SSH on HCL easier == Here is a very convenient way to set up the access to any machine directly instead of doing a cascade of SSH calls : Host csserver User kdichev Hostname csserver.ucd.ie Host heterogeneous User kiril Hostname heterogeneous.ucd.ie ProxyCommand ssh -qax csserver nc %h %p Host hcl01 Hostname hcl01.ucd.ie ProxyCommand ssh -qax heterogeneous nc %h %p Now, you can do: ssh hcl01 and you get logged to the node immediately. 86fb86dc44de85cfe377e1fcc42110cb57da933b 164 163 2010-02-15T17:39:14Z Kiril 3 /* Making life with SSH on HCL easier */ wikitext text/x-wiki To set up passwordless SSH, there are three main things to do: * generate a pair of public/private keys on your local computer * copy the public key from the source computer to the target computer's authorized_keys file * check the permissions. You can repeat that transitively for "A->B->C". You can use the initial pair of keys everywhere. See here for details: http://www.stearns.org/doc/ssh-techniques.current.html == Making life with SSH on HCL easier == Here is a very convenient way to set up the access to any machine directly instead of doing a cascade of SSH calls. Put this into your .ssh/config file : Host csserver User kdichev Hostname csserver.ucd.ie Host heterogeneous User kiril Hostname heterogeneous.ucd.ie ProxyCommand ssh -qax csserver nc %h %p Host hcl01 Hostname hcl01.ucd.ie ProxyCommand ssh -qax heterogeneous nc %h %p Now, you can do: ssh hcl01 and you get logged to the node immediately. 
29bdb1cd777a61d43a8334fd26e19aaad19eeaae 166 164 2010-02-15T17:47:55Z Kiril 3 [[Passwordless SSH]] moved to [[SSH utilization]]: two topics in one main topic wikitext text/x-wiki To set up passwordless SSH, there are three main things to do: * generate a pair of public/private keys on your local computer * copy the public key from the source computer to the target computer's authorized_keys file * check the permissions. You can repeat that transitively for "A->B->C". You can use the initial pair of keys everywhere. See here for details: http://www.stearns.org/doc/ssh-techniques.current.html == Making life with SSH on HCL easier == Here is a very convenient way to set up the access to any machine directly instead of doing a cascade of SSH calls. Put this into your .ssh/config file : Host csserver User kdichev Hostname csserver.ucd.ie Host heterogeneous User kiril Hostname heterogeneous.ucd.ie ProxyCommand ssh -qax csserver nc %h %p Host hcl01 Hostname hcl01.ucd.ie ProxyCommand ssh -qax heterogeneous nc %h %p Now, you can do: ssh hcl01 and you get logged to the node immediately. 29bdb1cd777a61d43a8334fd26e19aaad19eeaae 168 166 2010-02-15T17:48:41Z Kiril 3 wikitext text/x-wiki == Passwordless SSH == To set up passwordless SSH, there are three main things to do: * generate a pair of public/private keys on your local computer * copy the public key from the source computer to the target computer's authorized_keys file * check the permissions. You can repeat that transitively for "A->B->C". You can use the initial pair of keys everywhere. See here for details: http://www.stearns.org/doc/ssh-techniques.current.html == Making a cascade of SSH connections easy == Here is a very convenient way to set up the access to any machine directly instead of doing a cascade of SSH calls. 
Put this into your .ssh/config file:
<source lang="bash">
Host csserver
  User kdichev
  Hostname csserver.ucd.ie

Host heterogeneous
  User kiril
  Hostname heterogeneous.ucd.ie
  ProxyCommand ssh -qax csserver nc %h %p

Host hcl01
  Hostname hcl01.ucd.ie
  ProxyCommand ssh -qax heterogeneous nc %h %p
</source>
Now you can do <code>ssh hcl01</code> and you are logged in to the node immediately.

= HCL cluster =

http://hcl.ucd.ie/Hardware

== Compilation on HCL ==
* Add to your environment:
<source lang="bash">
export ARCH=`uname -r`
if [ `hostname` == 'hcl13.ucd.ie' ]; then
  export ARCH=`uname -r`smp
fi
</source>
* On hcl09 and hcl10, create a directory $HOME/$ARCH. Actually, 2.4.27-2-386 and 2.6.11-1.1369_FC4smp directories will be created.
* Configure all the software on hcl09 and hcl10 with <code>--prefix=$HOME/$ARCH</code>

= Passwordless SSH =

#REDIRECT [[SSH utilization]]

= Autotools =

http://en.wikipedia.org/wiki/Autoconf
http://sourceware.org/autobook/autobook/autobook_toc.html

== Manuals ==
* http://www.gnu.org/software/autoconf/manual/index.html
* http://www.gnu.org/software/automake/manual/index.html
* http://www.gnu.org/software/libtool/manual/index.html

== Libraries ==
* includes (for the include directory): <code>include_HEADERS = ...</code>
* library: static <code>lib_LIBRARIES = library.a</code> or dynamic <code>lib_LTLIBRARIES = library.la</code>
* sources (internal C data structures and C++ template classes): <code>library_X_SOURCES = library.h ...</code>, where <code>X</code> = <code>a</code> or <code>la</code>
** <code>library.h</code> is like a [http://en.wikipedia.org/wiki/Precompiled_header precompiled header] (contains common headers and symbols), which is to be included in most of the source files of the library (there is no need for real precompiled headers in small C projects)
For example, http://hcl.ucd.ie/repos/CPM/trunk/MPIBlib/benchmarks/Makefile.am

== Configured headers ==
Configured headers (created from *.h.in) must not be included in the package, that is, they must not be listed in <code>include_HEADERS</code> or <code>*_SOURCES</code>. Instead, use:
* <code>nodist_include_HEADERS = *.h</code> for the configured headers as includes
* <code>nodist_*_SOURCES = *.h</code> or <code>BUILT_SOURCES = *.h</code> for the configured headers as sources
For example, http://hcl.ucd.ie/repos/CPM/trunk/MPIBlib/collectives/Makefile.am

== Extra files ==
To add extra files to the package, use <code>EXTRA_DIST = *</code>.
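The automake variables above can be combined in a single Makefile.am. A minimal sketch for a libtool library follows (the library and file names are invented for illustration):

```make
# Public headers, installed into $(includedir)
include_HEADERS = mylib.h
# Configured header (generated from mylib_config.h.in); must not go into the tarball
nodist_include_HEADERS = mylib_config.h

# A libtool (dynamic) library and its sources
lib_LTLIBRARIES = libmylib.la
libmylib_la_SOURCES = mylib.h core.c util.c

# Extra, non-built files to ship with "make dist"
EXTRA_DIST = README
```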
For example, http://hcl.ucd.ie/repos/CPM/trunk/MPIBlib/tools/Makefile.am

== Conditional building ==
* http://www.gnu.org/software/hello/manual/automake/Conditionals.html
* In the source code, use macros:
<source lang="C">
#ifdef SYMBOL
...
#endif
</source>

== Script for downloading & installing recent versions (in March 2010) of m4, libtool, autoconf, automake ==
<source lang="bash">
#!/bin/bash
parent_dir=$PWD
export PATH=$HOME/$ARCH/bin:$PATH

wget http://ftp.gnu.org/gnu/libtool/libtool-2.2.6b.tar.gz
tar xzf libtool-2.2.6b.tar.gz
cd libtool-2.2.6b
./configure --prefix=$HOME/$ARCH
make install
cd $parent_dir

wget http://ftp.gnu.org/gnu/m4/m4-1.4.14.tar.gz
tar xzf m4-1.4.14.tar.gz
cd m4-1.4.14
./configure --prefix=$HOME/$ARCH
make install
cd $parent_dir

wget http://ftp.gnu.org/gnu/autoconf/autoconf-2.65.tar.bz2
tar xjf autoconf-2.65.tar.bz2
cd autoconf-2.65
./configure --prefix=$HOME/$ARCH
make install
cd $parent_dir

wget http://ftp.gnu.org/gnu/automake/automake-1.10.3.tar.bz2
tar xjf automake-1.10.3.tar.bz2
cd automake-1.10.3
./configure --prefix=$HOME/$ARCH
make install
cd $parent_dir
</source>

= Boost =

http://www.boost.org/

== Installation from sources ==
1. By default, boost is configured with all libraries.
To save time building boost, you can configure it only with the libraries you need:
<source lang="bash">
$ ./configure --prefix=DIR --with-libraries=graph,serialization
</source>
2. The default installation layout is:
* DIR/include/boost_version/boost
* DIR/lib/libboost_library_versions.*
Create symbolic links:
<source lang="bash">
$ cd DIR/include; ln -s boost_version/boost
$ cd DIR/lib; ln -s libboost_[library]_[version].[a/so] libboost_[library].[a/so]
$ export LD_LIBRARY_PATH=DIR/lib:$LD_LIBRARY_PATH
</source>

== Documentation ==
* [http://www.boost.org/doc/libs/1_42_0/libs/graph/doc/table_of_contents.html Graph]
* [http://www.boost.org/doc/libs/1_42_0/libs/serialization/doc/index.html Serialization]

= SSH =

== Passwordless SSH ==
To set up passwordless SSH, there are three main things to do:
* generate a pair of public/private keys on your local computer
* copy the public key from the source computer to the target computer's authorized_keys file
* check the permissions.
You can repeat that transitively for "A->B->C". You can use the initial pair of keys everywhere. See here for details: http://www.stearns.org/doc/ssh-techniques.current.html

== Making a cascade of SSH connections easy ==
Here is a very convenient way to set up direct access to any machine instead of doing a cascade of SSH calls. Put this into your .ssh/config file:
<source lang="bash">
Host csserver
  User kdichev
  Hostname csserver.ucd.ie

Host heterogeneous
  User kiril
  Hostname heterogeneous.ucd.ie
  ProxyCommand ssh -qax csserver nc %h %p

Host hcl01
  Hostname hcl01.ucd.ie
  ProxyCommand ssh -qax heterogeneous nc %h %p
</source>
Now you can do <code>ssh hcl01</code> and you are logged in to the node immediately.
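With many cluster nodes, the per-host stanzas can be collapsed with a wildcard pattern (a sketch; it assumes all nodes follow the hclNN.ucd.ie naming used above and are reachable through heterogeneous — the <code>%h</code> token expands to the host name given on the command line):

```
Host hcl*
  Hostname %h.ucd.ie
  ProxyCommand ssh -qax heterogeneous nc %h %p
```

With this single stanza, <code>ssh hcl05</code> or <code>ssh hcl11</code> resolves and tunnels automatically.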
== X11 forwarding ==
<source lang="bash">
ssh -X hostname
</source>

= SSH utilization =

#REDIRECT [[SSH]]

= Main Page =

This site is set up for sharing ideas, findings and experience in heterogeneous computing. Please log in and create new or edit existing pages. To learn how to format wiki pages, read [[Help:Editing|here]].

== Operating systems ==
* [[Linux]]
* [[Windows]]

== Development tools ==
* [[C/C++]]
* [[Autotools]]
* [[GDB]]
* [[OProfile]], [[Valgrind]]
* [[ChangeLog]]
* [[Doxygen]]
* [[Subversion]]
* [[Eclipse]]

== [[Libraries]] ==
* [[MPI]]
* [[STL]]
* [[Boost]]
* [[GSL]]
* [[BLAS/LAPACK]]
* [[The MPI LogP Benchmark|logp_mpi]]

== Data processing ==
* [[gnuplot]]
* [[Graphviz]]
* [[R]]
* [[Octave]]

== Presentation ==
* [[Dia]]
* [[Latex]]

== Clusters ==
* [[SSH]]
* [[HCL cluster]]
* [[Other UCD Resources]]
* [[UTK clusters]]
* [[Grid5000]]

== Some Maths ==
* [http://en.wikipedia.org/wiki/Confidence_interval Confidence interval (Statistics)]. [http://en.wikipedia.org/wiki/Student's_t-distribution Student's t-distribution], which is used to find the confidence interval, is implemented in [[GSL]] (<code>gsl_cdf_tdist_Pinv</code>)
* [http://en.wikipedia.org/wiki/Binomial_tree#Binomial_tree Binomial tree].
Use [[Graphviz]] to visualize trees
Use [[Graphviz]] to visualize trees

LaTeX

* Beamer - a package for presentation slides
* Listings - a package for putting programming code within LaTeX - http://tug.ctan.org/cgi-bin/ctanPackageInformation.py?id=listings
* LaTeX can be used in the [[Doxygen]] documentation in order to include formulas, biblio references etc.

== Editors ==
* Kile
* Emacs + plugin
* [[Eclipse]] + [[TeXlipse]]

C/C++

== Coding ==
* [http://en.wikipedia.org/wiki/Pragma_once Coding header files]
* [http://en.wikipedia.org/wiki/Indent_style#Variant:_1TBS One-true-brace indent style]
* Learn from examples and use coding approaches from third-party software

== Commenting ==
* Place [[Doxygen]] comments in header files (before declarations of namespaces/classes/structs/typedefs/macros) and main source files (for documenting tools and tests)
* Use double forward slashes for short comments in the code

== C++ ==
* [http://developers.sun.com/solaris/articles/mixing.html
Mixing C/C++]
* Provide the main API in C
* Use plain C unless you need flexible data structures or [[STL]]/[[Boost]] functionality
* [http://en.wikipedia.org/wiki/Template_metaprogramming Template C++] is preferable from the point of view of runtime performance
* Mind the life cycle of objects: [http://en.wikipedia.org/wiki/Default_constructor Default constructor], [http://en.wikipedia.org/wiki/Copy_constructor Copy constructor], [http://en.wikipedia.org/wiki/Destructor_(computer_science) Destructor]
* [http://www.gnu.org/software/hello/manual/automake/Libtool-Convenience-Libraries.html Force C++ linking]

== General ==
* Don't use non-standard functions, like [http://en.wikipedia.org/wiki/Itoa itoa]
* [http://www.gnu.org/software/libc/manual/html_node/Program-Arguments.html Handling program arguments] (avoid <code>argp</code> since it is not supported on many platforms)
* [http://en.wikipedia.org/wiki/Dynamic_loading Dynamic loading of shared libraries]

Octave

http://www.gnu.org/software/octave/

Eclipse
http://www.eclipse.org
* Install Sun Java 6 (Debian package: sun-java6-jre)
* It is recommended to use the latest version from http://www.eclipse.org/downloads/
* Eclipse CDT - a distribution for C/C++ development
* How to install plugins: http://agile.csc.ncsu.edu/SEMaterials/tutorials/install_plugin/ http://www.venukb.com/2006/08/20/install-eclipse-plugins-the-easy-way/

== Plugins ==
* [[Linuxtools]]
* [[Eclox]]
* [[Subversive]] or [[Subclipse]]
* [[TeXlipse]]

== Usage ==
* To avoid unresolved inclusions at edit time, add paths to Project -> Properties -> C/C++ General -> Paths and Symbols -> Includes
* The comments <code>// TODO: ...</code> mark what you are going to do later. These parts of the code can easily be found if you open the Tasks view (Window -> Show View -> Tasks)

Linux

== Environment ==
* '''.*rc''' - for a non-login shell
* '''.*profile''' - for a login shell; uses the rc settings

== Utilities ==
* '''mc''' (midnight commander) - a file manager with a built-in text editor. To copy text, hold the shift button.

Libraries

== Installation from packages ==
* Install a development package <code>*-dev</code>, which includes header files and static libraries.
Base packages will be installed automatically
* Headers and libraries can be found in standard paths, which are searched by compilers by default

== Manual installation ==
* Download a package or export/checkout a repository to <code>$HOME/src/DIR</code>
* Configure
<source lang="bash">
./configure --prefix=$HOME
</source>
If there are alternatives for the third-party software, or if the third-party software may be built with alternative middleware (for example, MPI: LAM, Open MPI, MPICH-1, MPICH-2), configure to a subfolder to avoid overwriting
<source lang="bash">
./configure --prefix=$HOME/SUBDIR
</source>
The packages you are developing and testing should also be in subfolders, to avoid a mismatch during building (see the compile-time environment below); otherwise you need to run <code>make uninstall</code> before building the new version of your code.
* Build and install
<source lang="bash">
make all install
</source>
* Update the [[Linux]] run-time environment
<source lang="bash">
PATH=$HOME/bin:$PATH
LD_LIBRARY_PATH=$HOME/lib:$LD_LIBRARY_PATH
</source>
* Update the [[Linux]] compile-time environment (for [[Autotools]] and [[C/C++]])
<source lang="bash">
CPATH=$HOME/include:$CPATH
LIBRARY_PATH=$HOME/lib:$LIBRARY_PATH
</source>

MPI

== Implementations ==
* Open MPI: [http://www.open-mpi.org]

== Manual installation ==
Install in a separate subfolder <code>$HOME/SUBDIR</code>, because you may need several MPI implementations (see [[Libraries]])

== Tips & Tricks ==
* For safe consecutive communications, create a new context, for example:
<source lang="C">
int communication_operation(MPI_Comm comm) {
  MPI_Comm newcomm;
  MPI_Comm_dup(comm, &newcomm);
  ... // work with newcomm
  MPI_Comm_free(&newcomm);
}
</source>
Mind the overhead of <code>MPI_Comm_dup</code> and <code>MPI_Comm_free</code>.
== Debugging ==
* Add the following code:
<source lang="C">
int rank;
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
if (!rank) getc(stdin);
MPI_Barrier(MPI_COMM_WORLD);
</source>
* Compile your code with the <code>-g</code> option
* Run the parallel application
* Attach to the process(es) from [[GDB]]
** MPICH-1 runs a background process for each application process: 0, 0b, 1, 1b, ..., therefore, attach to the first ones.
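The manual-installation advice above recommends keeping each MPI implementation in its own subfolder under <code>$HOME</code>; switching between implementations is then just an environment update. A minimal sketch (the subfolder name <code>openmpi</code> is hypothetical, chosen for illustration):

```shell
#!/bin/sh
# Sketch: select one of several MPI installations living under $HOME.
# "openmpi" is a hypothetical subfolder name; substitute your SUBDIR.
MPI_DIR="$HOME/openmpi"

# Prepend the chosen installation so its mpicc/mpirun and libraries
# take precedence over any system-wide MPI.
PATH="$MPI_DIR/bin:$PATH"
LD_LIBRARY_PATH="$MPI_DIR/lib:${LD_LIBRARY_PATH:-}"
export PATH LD_LIBRARY_PATH

# The first PATH entry is now the selected MPI's bin directory.
printf '%s\n' "$PATH" | cut -d: -f1
```

Re-running the snippet with a different <code>MPI_DIR</code> (e.g. <code>$HOME/mpich2</code>) switches the active implementation for subsequent builds and runs in the same session.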
HCL cluster

http://hcl.ucd.ie/Hardware

== Compilation on HCL ==
* Add to your environment
<source lang="bash">
export ARCH=`uname -r`
if [ `hostname` == 'hcl13.ucd.ie' ]; then
  export ARCH=`uname -r`smp
fi
</source>
* On hcl09 and hcl10, create a directory $HOME/$ARCH. Actually, 2.4.27-2-386 and 2.6.11-1.1369_FC4smp directories will be created.
* Configure all the software on hcl09 and hcl10 with <code>--prefix=$ARCH</code>

== Installing Precompiled Software on HCL Fedora Core 4 nodes ==
If you need a piece of software that is not already installed, you may not need to compile it from source. It is possible to install pre-compiled packages to folders in your home directory<ref>http://www.ajay.ws/2009/7/10/install-a-rpm-in-home-directory-as-non-root</ref>, and this is often easier than compiling from source. The cluster uses two different OSes at present, Fedora Core 4 and Debian. These have different pre-compiled package formats: RPM for Fedora and .deb for Debian. Unfortunately, relocating the default install directory of a .deb package is troublesome, so this article only pertains to Fedora. Fedora Core 4 was EOL'd in 2006<ref>http://fedoraproject.org/wiki/LifeCycle/EOL</ref> and 4 years later there are few remaining resources supporting it.
You can still find Fedora 4 RPMs at the following links (long unsearchable lists, give them time to load):
http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/core/4/i386/os/Fedora/i386
http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/extras/4/i386
http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/core/updates/4/i386

What follows is an example of installing the 'gv' (ghostview) package into a folder defined by an environment variable, <code>$APPSDIR</code> (which has a similar purpose to <code>$ARCH</code> in the previous section).
#find an RPM file for your desired package
#create the rpm database directory in the local tree (only do this once):<br/><code>mkdir -p $APPSDIR/var/lib/rpm</code>
#initialise the rpm database (also only once):<br/><code>rpm --initdb --root $APPSDIR</code>
#check what files are installed by the RPM file of your package:<br/><code>rpm -qlp gv-3.6.1-4.fc4.i386.rpm</code>
#test for dependencies (the install will not be successful and the command will report an error accessing the root rpm database; ignore this):<br/><code>rpm -ivh gv-3.6.1-4.fc4.i386.rpm</code><br/>if a dependency is found, the RPM for it must also be found and appended to the <code>rpm -i</code> command (until no further dependencies are required).
#do the actual install to our local folder:<br/><code>rpm --root $APPSDIR --relocate /usr=$APPSDIR --nodeps -ivh gv-3.6.1-4.fc4.i386.rpm</code>

Other UCD Resources

You will need login credentials to view this page: http://www.csi.ucd.ie/content/accounts-and-systems AIX/PowerPC, Solaris/x86 and Linux/x86 systems are available and can be handy for testing the reliability of your code across many platforms.

Eclox

[[Doxygen]] for [[Eclipse]] http://eclox.eu/ Eclipse install http://download.gna.org/eclox/update/site.xml If Doxygen is integrated in the autoconf/automake scripts, there is no need to run Doxygen using Eclox, but the editor of Doxygen files provided by Eclox may be useful.
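The RPM relocation recipe on the [[HCL cluster]] page can be wrapped in a small shell function that composes the final <code>rpm</code> command from <code>$APPSDIR</code> and a package file name. This is a sketch only: it echoes the command for inspection instead of executing it, and the gv file name is the wiki's own example.

```shell
#!/bin/sh
# Sketch: compose the local (non-root) RPM install command described on
# the HCL cluster page. The command is echoed, not executed, so it can
# be inspected first. $APPSDIR is assumed to be set by the user; the
# fallback here is only for illustration.
APPSDIR="${APPSDIR:-$HOME/apps}"

rpm_local_install_cmd() {
    # $1 - path to the .rpm file to install under $APPSDIR
    echo "rpm --root $APPSDIR --relocate /usr=$APPSDIR --nodeps -ivh $1"
}

rpm_local_install_cmd gv-3.6.1-4.fc4.i386.rpm
```

Piping the echoed command through <code>sh</code> (or removing the <code>echo</code>) performs the actual install once the dependency check from step 5 has passed.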
Putty

== Encoding ==
If you use Putty to log on to the HCL cluster and notice strange encoding of some characters on the console, in Putty go to Window → Translation and set the character set to UTF-8 (the same encoding as on the HCL cluster nodes).

R

http://cran.r-project.org/

== Installation from sources ==
1. R should be configured as a shared library
<source lang="bash">
$ ./configure --prefix=DIR --enable-R-shlib=yes
$ make install
</source>
2. Set up the environment
<source lang="bash">
$ export R_HOME=DIR/lib/R
$ export LD_LIBRARY_PATH=$R_HOME/lib:$LD_LIBRARY_PATH
</source>
3. Install required packages
<source lang="bash">
$ DIR/bin/R
> install.packages(c("sandwich", "strucchange", "zoo"))
</source>
4.
Set up the developer environment
<source lang="bash">
$ export CPATH=$R_HOME/include:$CPATH
$ export LIBRARY_PATH=$R_HOME/lib:$LIBRARY_PATH
</source>

Windows

== SSH clients ==
* putty http://www.chiark.greenend.org.uk/~sgtatham/putty/ If you use Putty to log on to the HCL cluster and notice strange encoding of some characters on the console, in Putty go to Window → Translation and set the character set to UTF-8 (the same encoding as on the HCL cluster nodes)
* cygwin http://www.cygwin.com/ (includes X windows and can be used with X11 forwarding)

Autotools

http://en.wikipedia.org/wiki/Autoconf http://sourceware.org/autobook/autobook/autobook_toc.html

== Manuals ==
* http://www.gnu.org/software/autoconf/manual/index.html
*
http://www.gnu.org/software/automake/manual/index.html
* http://www.gnu.org/software/libtool/manual/index.html

== Tutorials ==
* http://www.lrde.epita.fr/~adl/autotools.html (a very nice set of slides)

== Libraries ==
* includes (for the include directory): <code>include_HEADERS = ...</code>
* library: static <code>lib_LIBRARIES = library.a</code> or dynamic <code>lib_LTLIBRARIES = library.la</code>
* sources (internal C data structures and C++ template classes): <code>library_X_SOURCES = library.h ...</code>, where <code>X</code> = <code>a</code> or <code>la</code>
** <code>library.h</code> is like a [http://en.wikipedia.org/wiki/Precompiled_header precompiled header] (it contains common headers and symbols), which is to be included in most of the source files of the library (there is no need for real precompiled headers in small C projects)
For example, http://hcl.ucd.ie/repos/CPM/trunk/MPIBlib/benchmarks/Makefile.am

== Configured headers ==
Configured headers (created from *.h.in) must not be included in the package, that is, in <code>include_HEADERS</code> or <code>*_SOURCES</code>:
* <code>nodist_include_HEADERS = *.h</code> for the configured headers as includes
* <code>nodist_*_SOURCES = *.h</code> or <code>BUILT_SOURCES = *.h</code> for the configured headers as sources
For example, http://hcl.ucd.ie/repos/CPM/trunk/MPIBlib/collectives/Makefile.am

== Extra files ==
To add extra files to the package, use <code>EXTRA_DIST = *</code>. For example, http://hcl.ucd.ie/repos/CPM/trunk/MPIBlib/tools/Makefile.am

== Conditional building ==
* http://www.gnu.org/software/hello/manual/automake/Conditionals.html
* In the source code, use macros
<source lang="C">
#ifdef SYMBOL
...
#endif
</source>

== Script for downloading & installing recent versions (in March 2010) of m4, libtool, autoconf, automake ==
<source lang="bash">
#!/bin/bash
parent_dir=$PWD
export PATH=$HOME/$ARCH/bin:$PATH
wget http://ftp.gnu.org/gnu/libtool/libtool-2.2.6b.tar.gz
tar xzf libtool-2.2.6b.tar.gz
cd libtool-2.2.6b
./configure --prefix=$HOME/$ARCH
make install
cd $parent_dir
wget http://ftp.gnu.org/gnu/m4/m4-1.4.14.tar.gz
tar xfz m4-1.4.14.tar.gz
cd m4-1.4.14
./configure --prefix=$HOME/$ARCH
make install
cd $parent_dir
wget http://ftp.gnu.org/gnu/autoconf/autoconf-2.65.tar.bz2
tar xjf autoconf-2.65.tar.bz2
cd autoconf-2.65
./configure --prefix=$HOME/$ARCH
make install
cd $parent_dir
wget http://ftp.gnu.org/gnu/automake/automake-1.10.3.tar.bz2
tar xjf automake-1.10.3.tar.bz2
cd automake-1.10.3
./configure --prefix=$HOME/$ARCH
make install
cd $parent_dir
</source>

Main Page

This site is set up for sharing ideas, findings and experience in heterogeneous computing. Please log in and create new or edit existing pages. To learn how to format wiki pages, read [[Help:Editing|here]].

== Operating systems ==
* [[Linux]]
* [[Windows]]

== Development tools ==
* [[C/C++]], [[Python]]
* [[Autotools]]
* [[GDB]], [[OProfile]], [[Valgrind]]
* [[Doxygen]]
* [[ChangeLog]], [[Subversion]]
* [[Eclipse]]

== [[Libraries]] ==
* [[MPI]]
* [[STL]], [[Boost]]
* [[GSL]]
* [[BLAS/LAPACK]]
* [[The MPI LogP Benchmark|logp_mpi]]

== Data processing ==
* [[gnuplot]]
* [[Graphviz]]
* [[Octave]], [[R]]

== Presentation ==
* [[Dia]]
* [[Latex]]

== Clusters ==
* [[HCL cluster]]
* [[Other UCD Resources]]
* [[UTK clusters]]
* [[Grid5000]]
* [[How to connect to cluster via SSH|SSH]]

== Some Maths ==
* [http://en.wikipedia.org/wiki/Confidence_interval Confidence interval (Statistics)]. [http://en.wikipedia.org/wiki/Student's_t-distribution Student's t-distribution], which is used to find confidence intervals, is implemented in [[GSL]] (<code>gsl_cdf_tdist_Pinv</code>)
* [http://en.wikipedia.org/wiki/Binomial_tree#Binomial_tree Binomial tree]. Use [[Graphviz]] to visualize trees
== Operating systems == * [[Linux]] * [[Windows]] == Development tools == * [[C/C++]], [[Python]] * [[Autotools]] * [[GDB]], [[OProfile]], [[Valgrind]] * [[Doxygen]] * [[ChangeLog]], [[Subversion]] * [[Eclipse]] == [[Libraries]] == * [[MPI]] * [[STL]], [[Boost]] * [[GSL]] * [[BLAS/LAPACK]] * [[The MPI LogP Benchmark|logp_mpi]] == Data processing == * [[gnuplot]] * [[Graphviz]] * [[Octave]], [[R]] == Presentation == * [[Dia]] * [[Latex]] == Clusters == * [[HCL cluster]] * [[Other UCD Resources]] * [[UTK clusters]] * [[Grid5000]] [[SSH|How to connect to cluster via SSH]] == Some Maths == * [http://en.wikipedia.org/wiki/Confidence_interval Confidence interval (Statistics)]. [http://en.wikipedia.org/wiki/Student's_t-distribution Student's t-distribution], which is used to find confidence interval, is implemented in [[GSL]] (<code>gsl_cdf_tdist_Pinv</code>) * [http://en.wikipedia.org/wiki/Binomial_tree#Binomial_tree Binomial tree]. Use [[Graphviz]] to visualize trees a7870af552b0bd4510207049c459af0faf721abe Python 0 43 239 2010-04-17T09:12:58Z Root 1 New page: Recommended for scripting and software integration wikitext text/x-wiki Recommended for scripting and software integration ec33fa3c17f3148641e3e721bda6a6e48f7b8d14 240 239 2010-04-17T09:13:24Z Root 1 Removing all content from page wikitext text/x-wiki da39a3ee5e6b4b0d3255bfef95601890afd80709 261 240 2010-04-21T09:14:41Z Root 1 wikitext text/x-wiki http://www.python.org/ d0825a75aec4b45113643e63336fac8f7756e715 Linux 0 3 243 200 2010-04-21T07:52:58Z Root 1 wikitext text/x-wiki == Environment == * '''.*rc''' - for non-login * shell * '''.*profile''' - for login * shell, uses the rc settings == Utilities == * '''mc''' (midnight commander) - a file manager with a built-in text editor. To copy text, hold the shift button. 
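The Environment note above (rc files for non-login shells, profile for login shells, which "uses the rc settings") is usually implemented by sourcing the rc file from the profile. A minimal sketch; the helper name <code>load_rc</code> is ours:

```shell
# load_rc: source a shell rc file if it exists.
# Typical use, placed in ~/.profile:  load_rc "$HOME/.bashrc"
load_rc() {
    [ -f "$1" ] && . "$1"
}
```

This way interactive settings (aliases, prompt, PATH tweaks) live in one file and login shells simply inherit them.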
== Tips and Tricks == * [[SSH|How to connect via SSH]] 6bb87c6b6479a196fde231aa4045002053a2336e Grid5000 0 6 244 15 2010-04-21T08:13:50Z Root 1 wikitext text/x-wiki https://www.grid5000.fr/mediawiki/index.php/Grid5000:Home == Tips and Tricks == * Access is provided via access nodes: '''access.SITE.grid5000.fr'''. As soon as you are on one of the sites, access.SITE1.grid5000.fr, you can directly ssh to any frontend node: '''frontend.SITE2'''. * Jobs are run from the frontend nodes, using [https://www.grid5000.fr/mediawiki/index.php/Cluster_experiment-OAR2 OAR] * Each site has a separate NFS, therefore, to run an application on several sites at once, you need to copy it: scp, sftp, rsync. * There is no access to the Internet from computing nodes (external IPs should be registered on the proxy), therefore, download/update your stuff at the access nodes. c33744b871262eca5614fa860ed5083c644ca1d9 245 244 2010-04-21T08:22:37Z Root 1 wikitext text/x-wiki https://www.grid5000.fr/mediawiki/index.php/Grid5000:Home == Tips and Tricks == * Access is provided via access nodes: '''access.SITE.grid5000.fr'''. As soon as you are on one of the sites, you can directly ssh to the frontend node of any other site: '''frontend.SITE2'''. * Jobs are run from the frontend nodes, using a PBS-like system, [https://www.grid5000.fr/mediawiki/index.php/Cluster_experiment-OAR2 OAR]. Basic commands: ** '''oarstat''' - queue status ** '''oarsub''' - job submission *** interactive job <source lang="bash"> $ oarsub -I </source> *** batch job <source lang="bash"> $ oarsub path_to_batch_file </source> ** '''oardel''' - job removal * Each site has a separate NFS, therefore, to run an application on several sites at once, you need to copy it: scp, sftp, rsync. * There is no access to the Internet from computing nodes (external IPs should be registered on the proxy), therefore, download/update your stuff at the access nodes.
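The two-hop login described above (access node, then frontend) can be wired into <code>~/.ssh/config</code> so that a single command reaches a frontend. A sketch with hypothetical login and site names; it assumes an OpenSSH client new enough to support <code>-W</code>:

```
Host g5k-nancy
    HostName frontend.nancy                              # frontend, resolvable from inside
    User jdoe                                            # hypothetical Grid5000 login
    ProxyCommand ssh jdoe@access.nancy.grid5000.fr -W %h:%p
```

With this in place, <code>ssh g5k-nancy</code> (and likewise <code>scp</code>/<code>rsync</code> over it) behaves like a direct login to the frontend.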
03d403710adaf611ae275ae516414764090dfded 246 245 2010-04-21T08:23:08Z Root 1 wikitext text/x-wiki https://www.grid5000.fr/mediawiki/index.php/Grid5000:Home == Tips and Tricks == * Access is provided via access nodes: '''access.SITE.grid5000.fr'''. As soon as you are on one of the sites, you can directly ssh to the frontend node of any other site: '''frontend.SITE2'''. * Jobs are run from the frontend nodes, using a PBS-like system, [https://www.grid5000.fr/mediawiki/index.php/Cluster_experiment-OAR2 OAR]. Basic commands: ** '''oarstat''' - queue status ** '''oarsub''' - job submission *** interactive job <source lang="bash">$ oarsub -I</source> *** batch job <source lang="bash">$ oarsub path_to_batch_file</source> ** '''oardel''' - job removal * Each site has a separate NFS, therefore, to run an application on several sites at once, you need to copy it: scp, sftp, rsync. * There is no access to the Internet from computing nodes (external IPs should be registered on the proxy), therefore, download/update your stuff at the access nodes. 33a9b2ec9e0641ced254a2875e158ccd035d1411 247 246 2010-04-21T08:23:34Z Root 1 wikitext text/x-wiki https://www.grid5000.fr/mediawiki/index.php/Grid5000:Home == Tips and Tricks == * Access is provided via access nodes: '''access.SITE.grid5000.fr'''. As soon as you are on one of the sites, you can directly ssh to the frontend node of any other site: '''frontend.SITE2'''. * Jobs are run from the frontend nodes, using a PBS-like system, [https://www.grid5000.fr/mediawiki/index.php/Cluster_experiment-OAR2 OAR]. Basic commands: ** '''oarstat''' - queue status ** '''oarsub''' - job submission *** interactive job <source lang="bash">$ oarsub -I</source> *** batch job <source lang="bash">$ oarsub path_to_batch_file</source> ** '''oardel''' - job removal * Each site has a separate NFS, therefore, to run an application on several sites at once, you need to copy it: scp, sftp, rsync.
* There is no access to the Internet from computing nodes (external IPs should be registered on the proxy), therefore, download/update your stuff at the access nodes. abf5ec2222ce56c223a5beaaebb83e4a796fad1f 248 247 2010-04-21T08:32:46Z Root 1 wikitext text/x-wiki https://www.grid5000.fr/mediawiki/index.php/Grid5000:Home == Tips and Tricks == * Access is provided via access nodes: '''access.SITE.grid5000.fr'''. As soon as you are on one of the sites, you can directly ssh to the frontend node of any other site: '''frontend.SITE2'''. * Jobs are run from the frontend nodes, using a PBS-like system, [https://www.grid5000.fr/mediawiki/index.php/Cluster_experiment-OAR2 OAR]. Basic commands: ** '''oarstat''' - queue status ** '''oarsub''' - job submission ** '''oardel''' - job removal Interactive job on deployed images: <source lang="bash"> $ oarsub -I -t deploy -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> Batch job on installed images: <source lang="bash"> $ oarsub BATCH_FILE -t allow_classic_ssh -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> * Each site has a separate NFS, therefore, to run an application on several sites at once, you need to copy it: scp, sftp, rsync. * There is no access to the Internet from computing nodes (external IPs should be registered on the proxy), therefore, download/update your stuff at the access nodes. d08a04cbca144bd995dffb870d15b3574768e755 249 248 2010-04-21T08:41:42Z Root 1 wikitext text/x-wiki https://www.grid5000.fr/mediawiki/index.php/Grid5000:Home == Usage == * Select sites and clusters for experiments, using information on the [https://www.grid5000.fr/mediawiki/index.php/Grid5000:Network#Grid.275000_Sites Grid5000 network] * Access is provided via access nodes: '''access.SITE.grid5000.fr'''. As soon as you are on one of the sites, you can directly ssh to the frontend node of any other site: '''frontend.SITE2'''.
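The <code>oarsub</code> reservations above leave the granted resources in <code>$OAR_NODEFILE</code>, a file listing one reserved core per line. A sketch of counting nodes and cores from it inside a batch file; the helper names are ours, not OAR's:

```shell
# Count distinct hosts and total cores from an OAR-style nodefile.
node_count() { sort -u "$1" | wc -l | tr -d ' '; }
core_count() { wc -l < "$1" | tr -d ' '; }

# Inside a batch file one might then do (application path hypothetical):
#   mpirun -n "$(core_count "$OAR_NODEFILE")" ./my_app
```

Separating the two counts matters because launchers typically want one daemon per node but one process per core.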
* There is no access to the Internet from computing nodes (external IPs should be registered on the proxy), therefore, download/update your stuff at the access nodes. * Each site has a separate NFS, therefore, to run an application on several sites at once, you need to copy it with '''scp, sftp, rsync''' between access or frontend nodes. * Jobs are run from the frontend nodes, using a PBS-like system, [https://www.grid5000.fr/mediawiki/index.php/Cluster_experiment-OAR2 OAR]. Basic commands: ** '''oarstat''' - queue status ** '''oarsub''' - job submission ** '''oardel''' - job removal Interactive job on deployed images: <source lang="bash"> $ oarsub -I -t deploy -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> Batch job on installed images: <source lang="bash"> $ oarsub BATCH_FILE -t allow_classic_ssh -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> a5e46db9ef59fdeef7f7c757a3f4224bcbcf9e50 250 249 2010-04-21T08:49:43Z Root 1 wikitext text/x-wiki https://www.grid5000.fr/mediawiki/index.php/Grid5000:Home == Usage == * Select sites and clusters for experiments, using information on the [https://www.grid5000.fr/mediawiki/index.php/Grid5000:Network#Grid.275000_Sites Grid5000 network] * Access is provided via access nodes: '''access.SITE.grid5000.fr'''. As soon as you are on one of the sites, you can directly ssh to the frontend node of any other site: '''frontend.SITE2'''. * There is no access to the Internet from computing nodes (external IPs should be registered on the proxy), therefore, download/update your stuff at the access nodes. * Each site has a separate NFS, therefore, to run an application on several sites at once, you need to copy it with '''scp, sftp, rsync''' between access or frontend nodes. * Jobs are run from the frontend nodes, using a [http://en.wikipedia.org/wiki/OpenPBS PBS]-like system [https://www.grid5000.fr/mediawiki/index.php/Cluster_experiment-OAR2 OAR].
Basic commands: ** '''oarstat''' - queue status ** '''oarsub''' - job submission ** '''oardel''' - job removal Interactive job on deployed images: <source lang="bash"> $ oarsub -I -t deploy -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> Batch job on installed images: <source lang="bash"> $ oarsub BATCH_FILE -t allow_classic_ssh -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> * The image to deploy can be created with the help of a [http://wiki.systemimager.org/index.php/Main_Page Systemimager]-like system, [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2 Kadeploy] * Running MPI applications is described [https://www.grid5000.fr/mediawiki/index.php/Run_MPI_On_Grid%275000 here] bd86a92463bf777f7ec3f567991da75227b3148e 251 250 2010-04-21T08:57:19Z Root 1 wikitext text/x-wiki https://www.grid5000.fr/mediawiki/index.php/Grid5000:Home == Usage == * Select sites and clusters for experiments, using information on the [https://www.grid5000.fr/mediawiki/index.php/Grid5000:Network#Grid.275000_Sites Grid5000 network] * Access is provided via access nodes: '''access.SITE.grid5000.fr'''. As soon as you are on one of the sites, you can directly ssh to the frontend node of any other site: '''frontend.SITE2'''. * There is no access to the Internet from computing nodes (external IPs should be registered on the proxy), therefore, download/update your stuff at the access nodes. Some revision control clients are available. * Each site has a separate NFS, therefore, to run an application on several sites at once, you need to copy it with '''scp, sftp, rsync''' between access or frontend nodes. * Jobs are run from the frontend nodes, using a [http://en.wikipedia.org/wiki/OpenPBS PBS]-like system [https://www.grid5000.fr/mediawiki/index.php/Cluster_experiment-OAR2 OAR].
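Since each site has its own NFS, the scp/rsync copying step above has to be repeated once per target site. A sketch of a loop over frontends, with <code>$RSYNC</code> overridable so the sketch can be dry-run; the function and variable names are ours, and the site names in the example are hypothetical:

```shell
RSYNC=${RSYNC:-rsync}            # override with e.g. "echo rsync" for a dry run
sync_sites() {                   # push DIR to frontend.<site> for each listed site
    local dir=$1; shift
    for site in "$@"; do
        $RSYNC -az "$dir/" "frontend.$site:$dir/"    # mirror the same local path
    done
}
# example: sync_sites "$HOME/myapp" nancy rennes
```

Using the same absolute path on every site keeps batch files identical across sites.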
Basic commands: ** '''oarstat''' - queue status ** '''oarsub''' - job submission ** '''oardel''' - job removal Interactive job on deployed images: <source lang="bash"> $ oarsub -I -t deploy -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> Batch job on installed images: <source lang="bash"> $ oarsub BATCH_FILE -t allow_classic_ssh -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> * The image to deploy can be created with the help of a [http://wiki.systemimager.org/index.php/Main_Page Systemimager]-like system, [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2 Kadeploy]. A Linux distribution lenny-x64-nfs-2.1 with mc, subversion, autotools, doxygen, MPICH2, GSL, Boost, R, gnuplot, graphviz, X11, evince is available at Orsay /home/nancy/alastovetsky/grid5000. * Running MPI applications is described [https://www.grid5000.fr/mediawiki/index.php/Run_MPI_On_Grid%275000 here] 6a3d8467eb324dd4a403d725499127932e861651 MPI 0 29 252 219 2010-04-21T08:59:01Z Root 1 /* Implementations */ wikitext text/x-wiki == Implementations == * [[LAM]] * [[MPICH]] * [[OpenMPI]] * [[MPICH2]] == Manual installation == Install into a separate subfolder <code>$HOME/SUBDIR</code>, because you may need several MPI implementations (see [[Libraries]]) == Tips & Tricks == * For safe consecutive communications, create a new context, for example: <source lang="C"> int communication_operation(MPI_Comm comm) { MPI_Comm newcomm; MPI_Comm_dup(comm, &newcomm); ... // work with newcomm MPI_Comm_free(&newcomm); } </source> Mind the overhead of <code>MPI_Comm_dup</code> and <code>MPI_Comm_free</code>.
== Debugging == * Add the following code: <source lang="C"> int rank; MPI_Comm_rank(MPI_COMM_WORLD, &rank); if (!rank) getc(stdin); MPI_Barrier(MPI_COMM_WORLD); </source> * Compile your code with <code>-g</code> option * Run parallel application * Attach to process(es) from [[GDB]] ** MPICH-1 runs a background process for each application process: 0, 0b, 1, 1b, ..., therefore, attach to the first ones. e463ed648d4ea4499817a27f7867224085c22568 255 252 2010-04-21T09:06:43Z Root 1 wikitext text/x-wiki == Implementations == * [[LAM]] * [[MPICH]] * [[OpenMPI]] * [[MPICH2]] http://www.mpi-forum.org/docs/docs.html == Manual installation == Install in separate subfolder <code>$HOME/SUBDIR</code>, because you may need some MPI implementations (see [[Libraries]]) == Tips & Tricks == * For safe consecutive communications create new context, for example: <source lang="C"> int communication_operation(MPI_Comm comm) { MPI_Comm newcomm; MPI_Comm_dup(comm, &newcomm); ... // work with newcomm MPI_Comm_free(&newcomm); } </source> Mind the overhead of <code>MPI_Comm_dup</code> and <code>MPI_Comm_free</code>. == Debugging == * Add the following code: <source lang="C"> int rank; MPI_Comm_rank(MPI_COMM_WORLD, &rank); if (!rank) getc(stdin); MPI_Barrier(MPI_COMM_WORLD); </source> * Compile your code with <code>-g</code> option * Run parallel application * Attach to process(es) from [[GDB]] ** MPICH-1 runs a background process for each application process: 0, 0b, 1, 1b, ..., therefore, attach to the first ones. 
1852f7355713f46540d2c86446782e14a0dec712 256 255 2010-04-21T09:06:55Z Root 1 wikitext text/x-wiki http://www.mpi-forum.org/docs/docs.html == Implementations == * [[LAM]] * [[MPICH]] * [[OpenMPI]] * [[MPICH2]] == Manual installation == Install in separate subfolder <code>$HOME/SUBDIR</code>, because you may need some MPI implementations (see [[Libraries]]) == Tips & Tricks == * For safe consecutive communications create new context, for example: <source lang="C"> int communication_operation(MPI_Comm comm) { MPI_Comm newcomm; MPI_Comm_dup(comm, &newcomm); ... // work with newcomm MPI_Comm_free(&newcomm); } </source> Mind the overhead of <code>MPI_Comm_dup</code> and <code>MPI_Comm_free</code>. == Debugging == * Add the following code: <source lang="C"> int rank; MPI_Comm_rank(MPI_COMM_WORLD, &rank); if (!rank) getc(stdin); MPI_Barrier(MPI_COMM_WORLD); </source> * Compile your code with <code>-g</code> option * Run parallel application * Attach to process(es) from [[GDB]] ** MPICH-1 runs a background process for each application process: 0, 0b, 1, 1b, ..., therefore, attach to the first ones. 7c2a4b32a06da1db8629165c00a1bc388d32fa0b 257 256 2010-04-21T09:07:46Z Root 1 wikitext text/x-wiki == Documentation == * http://www.mpi-forum.org/docs/docs.html == Implementations == * [[LAM]] * [[MPICH]] * [[OpenMPI]] * [[MPICH2]] == Manual installation == Install in separate subfolder <code>$HOME/SUBDIR</code>, because you may need some MPI implementations (see [[Libraries]]) == Tips & Tricks == * For safe consecutive communications create new context, for example: <source lang="C"> int communication_operation(MPI_Comm comm) { MPI_Comm newcomm; MPI_Comm_dup(comm, &newcomm); ... // work with newcomm MPI_Comm_free(&newcomm); } </source> Mind the overhead of <code>MPI_Comm_dup</code> and <code>MPI_Comm_free</code>. 
== Debugging == * Add the following code: <source lang="C"> int rank; MPI_Comm_rank(MPI_COMM_WORLD, &rank); if (!rank) getc(stdin); MPI_Barrier(MPI_COMM_WORLD); </source> * Compile your code with the <code>-g</code> option * Run the parallel application * Attach to process(es) from [[GDB]] ** MPICH-1 runs a background process for each application process: 0, 0b, 1, 1b, ..., therefore, attach to the first ones. 6a2a79f16c61da3f370b6ded7507ef6a13385c91 MPICH2 0 44 253 2010-04-21T09:00:10Z Root 1 New page: <code class="command">echo</code> "MPD_SECRETWORD=<code class="replace">secret</code>" > $HOME/.mpd.conf}} {{Term|location=frontend|cmd=<code class="command">chmod</code> 600 $HOME/.mpd.co... wikitext text/x-wiki <code class="command">echo</code> "MPD_SECRETWORD=<code class="replace">secret</code>" > $HOME/.mpd.conf}} {{Term|location=frontend|cmd=<code class="command">chmod</code> 600 $HOME/.mpd.conf}} Then you can use a script like this to launch mpd/mpirun: NODES=`uniq < $OAR_NODEFILE | wc -l | tr -d ' '` NPROCS=`wc -l < $OAR_NODEFILE | tr -d ' '` mpdboot --rsh=oarsh --totalnum=$NODES --file=$OAR_NODEFILE sleep 1 mpirun -n $NPROCS <code class="replace">mpich2binary</code> deab2890eea83d3c19357635a3707cb6488841f6 254 253 2010-04-21T09:05:15Z Root 1 wikitext text/x-wiki Settings for MPICH2 daemon: <source lang="bash"> $ echo "MPD_SECRETWORD=XXX" > ~/.mpd.conf $ chmod 600 ~/.mpd.conf </source> Script for running application: <source lang="bash"> NODES=`uniq < $OAR_NODEFILE | wc -l | tr -d ' '` NPROCS=`wc -l < $OAR_NODEFILE | tr -d ' '` mpdboot --rsh=ssh --totalnum=$NODES --file=$OAR_NODEFILE sleep 1 mpirun -n $NPROCS path_to_executable </source> 14dfae1846d106656fed0c1a05956364ca4ff705 MPICH 0 45 258 2010-04-21T09:10:13Z Root 1 New page: Doesn't support shared libraries linked against MPICH, because of MPICH global variables. wikitext text/x-wiki Doesn't support shared libraries linked against MPICH, because of MPICH global variables.
6518d010b50117e2d950b79ccc67d7b00deb2301 LAM 0 46 259 2010-04-21T09:11:49Z Root 1 New page: Before use, run <source lang="bash"> $ lamboot [-d -v] hostfile </source> wikitext text/x-wiki Before use, run <source lang="bash"> $ lamboot [-d -v] hostfile </source> 11f50db01f5572c0ed02b4b0488cb28c48b49678 OpenMPI 0 47 260 2010-04-21T09:12:33Z Root 1 New page: http://www.open-mpi.org/faq/ wikitext text/x-wiki http://www.open-mpi.org/faq/ 0db66b3172e2c86965d11a031d6e748bbe026639 HCL cluster 0 5 262 205 2010-04-23T12:53:34Z Rhiggins 4 wikitext text/x-wiki http://hcl.ucd.ie/Hardware == HCL Cluster 2.0 == In preparation for a fresh installation of operating systems on the HCL Cluster, the following list of packages is requested. After the upgrade is complete, this list will become a reference for installed software: * fftw * git * gnuplot * netperf * octave * qhull * subversion * valgrind == Compilation on HCL == * Add to your environment <source lang="bash"> export ARCH=`uname -r` if [ `hostname` == 'hcl13.ucd.ie' ]; then export ARCH=`uname -r`smp fi </source> * On hcl09 and hcl10, create a directory $HOME/$ARCH. Actually, 2.4.27-2-386 and 2.6.11-1.1369_FC4smp directories will be created. * Configure all the software on hcl09 and hcl10, with <code>--prefix=$ARCH</code> == Installing Precompiled Software on HCL Fedora Core 4 nodes == If you need a piece of software that is not already installed, you may not need to compile it from source. It is possible to install pre-compiled packages to folders in your home directory<ref>http://www.ajay.ws/2009/7/10/install-a-rpm-in-home-directory-as-non-root</ref>, and this is often easier than compiling from source. The cluster uses two different OSes at present, Fedora Core 4 and Debian. These both have different pre-compiled package formats, RPM for Fedora and .deb for Debian. Unfortunately, relocating the default install directory of a .deb package is troublesome, so this article only pertains to Fedora.
Fedora Core 4 was EOL'd in 2006<ref>http://fedoraproject.org/wiki/LifeCycle/EOL</ref> and 4 years later there are few remaining resources supporting it. You can still find Fedora 4 RPMs at the following links (long unsearchable lists, give them time to load): http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/core/4/i386/os/Fedora/i386 http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/extras/4/i386 http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/core/updates/4/i386 What follows is an example of installing the 'gv' (ghostview) package into a folder defined by an environment variable: <code>$APPSDIR</code> (which has a similar purpose to <code>$ARCH</code> in the previous section). #find an RPM file for your desired package<br/> #create rpm database directory in local tree (only do this once):<br/><code>mkdir -p $APPSDIR/var/lib/rpm</code> #initialise rpm database (do this only once also):<br/><code>rpm --initdb --root $APPSDIR</code> #check what files are installed by the RPM file of your package<br/><code>rpm -qlp gv-3.6.1-4.fc4.i386.rpm</code> #test for dependencies (install will not be successful and command will report an error accessing the root rpm database, ignore this):<br/><code>rpm -ivh gv-3.6.1-4.fc4.i386.rpm</code><br/>if a dependency is found, the RPM for it must also be found and appended to the <code>rpm -i</code> command (until no further dependencies are required). #do actual install to our local folder:<br/><code>rpm --root $APPSDIR --relocate /usr=$APPSDIR --nodeps -ivh gv-3.6.1-4.fc4.i386.rpm</code> ad38ee639132b10fea4e19f7c8da3127565bb0cd 263 262 2010-04-23T13:15:27Z Davepc 2 /* HCL Cluster 2.0 */ wikitext text/x-wiki http://hcl.ucd.ie/Hardware == HCL Cluster 2.0 == In preparation for a fresh installation of operating systems on the HCL Cluster, the following list of packages is requested.
After the upgrade is complete, this list will become a reference for installed software: * fftw2 * git * gnuplot * netperf * octave3.2 * qhull * subversion * valgrind == Compilation on HCL == * Add to your environment <source lang="bash"> export ARCH=`uname -r` if [ `hostname` == 'hcl13.ucd.ie' ]; then export ARCH=`uname -r`smp fi </source> * On hcl09 and hcl10, create a directory $HOME/$ARCH. Actually, 2.4.27-2-386 and 2.6.11-1.1369_FC4smp directories will be created. * Configure all the software on hcl09 and hcl10, with <code>--prefix=$ARCH</code> == Installing Precompiled Software on HCL Fedora Core 4 nodes == If you need a piece of software that is not already installed, you may not need to compile it from source. It is possible to install pre-compiled packages to folders in your home directory<ref>http://www.ajay.ws/2009/7/10/install-a-rpm-in-home-directory-as-non-root</ref>, and this is often easier than compiling from source. The cluster uses two different OSes at present, Fedora Core 4 and Debian. These both have different pre-compiled package formats, RPM for Fedora and .deb for Debian. Unfortunately, relocating the default install directory of a .deb package is troublesome, so this article only pertains to Fedora. Fedora Core 4 was EOL'd in 2006<ref>http://fedoraproject.org/wiki/LifeCycle/EOL</ref> and 4 years later there are few remaining resources supporting it. You can still find Fedora 4 RPMs at the following links (long unsearchable lists, give them time to load): http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/core/4/i386/os/Fedora/i386 http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/extras/4/i386 http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/core/updates/4/i386 What follows is an example of installing the 'gv' (ghostview) package into a folder defined by an environment variable: <code>$APPSDIR</code> (which has a similar purpose to <code>$ARCH</code> in the previous section).
#find an RPM file for your desired package<br/> #create rpm database directory in local tree (only do this once):<br/><code>mkdir -p $APPSDIR/var/lib/rpm</code> #initialise rpm database (do this only once also):<br/><code>rpm --initdb --root $APPSDIR</code> #check what files are installed by the RPM file of your package<br/><code>rpm -qlp gv-3.6.1-4.fc4.i386.rpm</code> #test for dependencies (install will not be successful and command will report an error accessing the root rpm database, ignore this):<br/><code>rpm -ivh gv-3.6.1-4.fc4.i386.rpm</code><br/>if a dependency is found, the RPM for it must also be found and appended to the <code>rpm -i</code> command (until no further dependencies are required). #do actual install to our local folder:<br/><code>rpm --root $APPSDIR --relocate /usr=$APPSDIR --nodeps -ivh gv-3.6.1-4.fc4.i386.rpm</code> 744cc9618c5df1e7768c2c04ca028ddff0ec913e 264 263 2010-04-23T13:16:26Z Rhiggins 4 /* HCL Cluster 2.0 */ wikitext text/x-wiki http://hcl.ucd.ie/Hardware == HCL Cluster 2.0 == In preparation for a fresh installation of operating systems on the HCL Cluster, the following list of packages is requested. After the upgrade is complete, this list will become a reference for installed software: * autoconf * automake * aclocal * fftw2 * git * gnuplot * netperf * octave3.2 * qhull * subversion * valgrind == Compilation on HCL == * Add to your environment <source lang="bash"> export ARCH=`uname -r` if [ `hostname` == 'hcl13.ucd.ie' ]; then export ARCH=`uname -r`smp fi </source> * On hcl09 and hcl10, create a directory $HOME/$ARCH. Actually, 2.4.27-2-386 and 2.6.11-1.1369_FC4smp directories will be created. * Configure all the software on hcl09 and hcl10, with <code>--prefix=$ARCH</code> == Installing Precompiled Software on HCL Fedora Core 4 nodes == If you need a piece of software that is not already installed, you may not need to compile it from source.
It is possible to install pre-compiled packages to folders in your home directory<ref>http://www.ajay.ws/2009/7/10/install-a-rpm-in-home-directory-as-non-root</ref>, and this is often easier than compiling from source. The cluster uses two different OSes at present, Fedora Core 4 and Debian. These both have different pre-compiled package formats, RPM for Fedora and .deb for Debian. Unfortunately, relocating the default install directory of a .deb package is troublesome, so this article only pertains to Fedora. Fedora Core 4 was EOL'd in 2006<ref>http://fedoraproject.org/wiki/LifeCycle/EOL</ref> and 4 years later there are few remaining resources supporting it. You can still find Fedora 4 RPMs at the following links (long unsearchable lists, give them time to load): http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/core/4/i386/os/Fedora/i386 http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/extras/4/i386 http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/core/updates/4/i386 What follows is an example of installing the 'gv' (ghostview) package into a folder defined by an environment variable: <code>$APPSDIR</code> (which has a similar purpose to <code>$ARCH</code> in the previous section). #find an RPM file for your desired package<br/> #create rpm database directory in local tree (only do this once):<br/><code>mkdir -p $APPSDIR/var/lib/rpm</code> #initialise rpm database (do this only once also):<br/><code>rpm --initdb --root $APPSDIR</code> #check what files are installed by the RPM file of your package<br/><code>rpm -qlp gv-3.6.1-4.fc4.i386.rpm</code> #test for dependencies (install will not be successful and command will report an error accessing the root rpm database, ignore this):<br/><code>rpm -ivh gv-3.6.1-4.fc4.i386.rpm</code><br/>if a dependency is found, the RPM for it must also be found and appended to the <code>rpm -i</code> command (until no further dependencies are required).
#do actual install to our local folder:<br/><code>rpm --root $APPSDIR --relocate /usr=$APPSDIR --nodeps -ivh gv-3.6.1-4.fc4.i386.rpm</code> 6ffe9e3617f1f88cb4dccfa815d671d60e2a64ff 265 264 2010-04-23T13:16:54Z Rhiggins 4 /* HCL Cluster 2.0 */ wikitext text/x-wiki http://hcl.ucd.ie/Hardware == HCL Cluster 2.0 == In preparation for a fresh installation of operating systems on the HCL Cluster, the following list of packages is requested. After the upgrade is complete, this list will become a reference for installed software: * autoconf * automake * fftw2 * git * gnuplot * libtool * netperf * octave3.2 * qhull * subversion * valgrind == Compilation on HCL == * Add to your environment <source lang="bash"> export ARCH=`uname -r` if [ `hostname` == 'hcl13.ucd.ie' ]; then export ARCH=`uname -r`smp fi </source> * On hcl09 and hcl10, create a directory $HOME/$ARCH. Actually, 2.4.27-2-386 and 2.6.11-1.1369_FC4smp directories will be created. * Configure all the software on hcl09 and hcl10, with <code>--prefix=$ARCH</code> == Installing Precompiled Software on HCL Fedora Core 4 nodes == If you need a piece of software that is not already installed, you may not need to compile it from source. It is possible to install pre-compiled packages to folders in your home directory<ref>http://www.ajay.ws/2009/7/10/install-a-rpm-in-home-directory-as-non-root</ref>, and this is often easier than compiling from source. The cluster uses two different OSes at present, Fedora Core 4 and Debian. These both have different pre-compiled package formats, RPM for Fedora and .deb for Debian. Unfortunately, relocating the default install directory of a .deb package is troublesome, so this article only pertains to Fedora. Fedora Core 4 was EOL'd in 2006<ref>http://fedoraproject.org/wiki/LifeCycle/EOL</ref> and 4 years later there are few remaining resources supporting it.
You can still find Fedora Core 4 RPMs at the following links (long unsearchable lists; give them time to load): http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/core/4/i386/os/Fedora/i386 http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/extras/4/i386 http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/core/updates/4/i386 What follows is an example of installing the 'gv' (ghostview) package into a folder defined by an environment variable, <code>$APPSDIR</code> (which has a similar purpose to <code>$ARCH</code> in the previous section). #Find an RPM file for your desired package.<br/> #Create the rpm database directory in the local tree (only do this once):<br/><code>mkdir -p $APPSDIR/var/lib/rpm</code> #Initialise the rpm database (also only once):<br/><code>rpm --initdb --root $APPSDIR</code> #Check which files are installed by the RPM file of your package:<br/><code>rpm -qlp gv-3.6.1-4.fc4.i386.rpm</code> #Test for dependencies (the install will not succeed and the command will report an error accessing the root rpm database; ignore this):<br/><code>rpm -ivh gv-3.6.1-4.fc4.i386.rpm</code><br/>If a dependency is reported, the RPM for it must also be found and appended to the <code>rpm -i</code> command (repeat until no further dependencies are required). #Do the actual install into our local folder:<br/><code>rpm --root $APPSDIR --relocate /usr=$APPSDIR --nodeps -ivh gv-3.6.1-4.fc4.i386.rpm</code> 981c884fd134a948f88eb91a37a8323aabb005e6 266 265 2010-04-23T13:19:29Z Davepc 2 /* HCL Cluster 2.0 */ wikitext text/x-wiki http://hcl.ucd.ie/Hardware == HCL Cluster 2.0 == In preparation for a fresh installation of operating systems on the HCL Cluster, the following list of packages is requested.
After the upgrade is complete, this list will become a reference for installed software: * autoconf * automake * fftw2 * git * gnuplot * libtool * netperf * octave3.2 * qhull * subversion * valgrind * r-cran-strucchange * libboost-graph * openmpi-bin -dev == Compilation on HCL == * Add to your environment <source lang="bash"> export ARCH=`uname -r` if [ `hostname` == 'hcl13.ucd.ie' ]; then export ARCH=`uname -r`smp fi </source> * On hcl09 and hcl10, create a directory $HOME/$ARCH. In practice, the directories 2.4.27-2-386 and 2.6.11-1.1369_FC4smp will be created. * Configure all the software on hcl09 and hcl10 with <code>--prefix=$ARCH</code> == Installing Precompiled Software on HCL Fedora Core 4 nodes == If you need a piece of software that is not already installed, you may not need to compile it from source. It is possible to install pre-compiled packages into folders in your home directory<ref>http://www.ajay.ws/2009/7/10/install-a-rpm-in-home-directory-as-non-root</ref>, and this is often easier than compiling from source. The cluster currently runs two different operating systems, Fedora Core 4 and Debian, which use different pre-compiled package formats: RPM for Fedora and .deb for Debian. Unfortunately, relocating the default install directory of a .deb package is troublesome, so this article only pertains to Fedora. Fedora Core 4 was EOL'd in 2006<ref>http://fedoraproject.org/wiki/LifeCycle/EOL</ref> and, four years later, few resources supporting it remain.
You can still find Fedora Core 4 RPMs at the following links (long unsearchable lists; give them time to load): http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/core/4/i386/os/Fedora/i386 http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/extras/4/i386 http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/core/updates/4/i386 What follows is an example of installing the 'gv' (ghostview) package into a folder defined by an environment variable, <code>$APPSDIR</code> (which has a similar purpose to <code>$ARCH</code> in the previous section). #Find an RPM file for your desired package.<br/> #Create the rpm database directory in the local tree (only do this once):<br/><code>mkdir -p $APPSDIR/var/lib/rpm</code> #Initialise the rpm database (also only once):<br/><code>rpm --initdb --root $APPSDIR</code> #Check which files are installed by the RPM file of your package:<br/><code>rpm -qlp gv-3.6.1-4.fc4.i386.rpm</code> #Test for dependencies (the install will not succeed and the command will report an error accessing the root rpm database; ignore this):<br/><code>rpm -ivh gv-3.6.1-4.fc4.i386.rpm</code><br/>If a dependency is reported, the RPM for it must also be found and appended to the <code>rpm -i</code> command (repeat until no further dependencies are required). #Do the actual install into our local folder:<br/><code>rpm --root $APPSDIR --relocate /usr=$APPSDIR --nodeps -ivh gv-3.6.1-4.fc4.i386.rpm</code> ed3df2f7a2eca07d9f4f677c4d86b97b6fddb00f 267 266 2010-04-23T13:19:56Z Rhiggins 4 /* HCL Cluster 2.0 */ wikitext text/x-wiki http://hcl.ucd.ie/Hardware == HCL Cluster 2.0 == In preparation for a fresh installation of operating systems on the HCL Cluster, the following list of packages is requested.
After the upgrade is complete, this list will become a reference for installed software: * autoconf * automake * fftw2 * git * gfortran * gnuplot * libtool * netperf * octave3.2 * qhull * subversion * valgrind * r-cran-strucchange * libboost-graph * openmpi-bin -dev == Compilation on HCL == * Add to your environment <source lang="bash"> export ARCH=`uname -r` if [ `hostname` == 'hcl13.ucd.ie' ]; then export ARCH=`uname -r`smp fi </source> * On hcl09 and hcl10, create a directory $HOME/$ARCH. In practice, the directories 2.4.27-2-386 and 2.6.11-1.1369_FC4smp will be created. * Configure all the software on hcl09 and hcl10 with <code>--prefix=$ARCH</code> == Installing Precompiled Software on HCL Fedora Core 4 nodes == If you need a piece of software that is not already installed, you may not need to compile it from source. It is possible to install pre-compiled packages into folders in your home directory<ref>http://www.ajay.ws/2009/7/10/install-a-rpm-in-home-directory-as-non-root</ref>, and this is often easier than compiling from source. The cluster currently runs two different operating systems, Fedora Core 4 and Debian, which use different pre-compiled package formats: RPM for Fedora and .deb for Debian. Unfortunately, relocating the default install directory of a .deb package is troublesome, so this article only pertains to Fedora. Fedora Core 4 was EOL'd in 2006<ref>http://fedoraproject.org/wiki/LifeCycle/EOL</ref> and, four years later, few resources supporting it remain.
You can still find Fedora Core 4 RPMs at the following links (long unsearchable lists; give them time to load): http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/core/4/i386/os/Fedora/i386 http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/extras/4/i386 http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/core/updates/4/i386 What follows is an example of installing the 'gv' (ghostview) package into a folder defined by an environment variable, <code>$APPSDIR</code> (which has a similar purpose to <code>$ARCH</code> in the previous section). #Find an RPM file for your desired package.<br/> #Create the rpm database directory in the local tree (only do this once):<br/><code>mkdir -p $APPSDIR/var/lib/rpm</code> #Initialise the rpm database (also only once):<br/><code>rpm --initdb --root $APPSDIR</code> #Check which files are installed by the RPM file of your package:<br/><code>rpm -qlp gv-3.6.1-4.fc4.i386.rpm</code> #Test for dependencies (the install will not succeed and the command will report an error accessing the root rpm database; ignore this):<br/><code>rpm -ivh gv-3.6.1-4.fc4.i386.rpm</code><br/>If a dependency is reported, the RPM for it must also be found and appended to the <code>rpm -i</code> command (repeat until no further dependencies are required). #Do the actual install into our local folder:<br/><code>rpm --root $APPSDIR --relocate /usr=$APPSDIR --nodeps -ivh gv-3.6.1-4.fc4.i386.rpm</code> 79a3989b487f565a37e09fe8decc8e0d0acd0c22 268 267 2010-04-23T13:21:42Z Davepc 2 /* HCL Cluster 2.0 */ wikitext text/x-wiki http://hcl.ucd.ie/Hardware == HCL Cluster 2.0 == In preparation for a fresh installation of operating systems on the HCL Cluster, the following list of packages is requested.
After the upgrade is complete, this list will become a reference for installed software: * autoconf * automake * fftw2 * git * gfortran * gnuplot * libtool * netperf * octave3.2 * qhull * subversion * valgrind * r-cran-strucchange * libboost-graph-dev * libboost-serialization-dev * openmpi-bin * openmpi-dev * gsl-dev * mc * x-11 * autotools == Compilation on HCL == * Add to your environment <source lang="bash"> export ARCH=`uname -r` if [ `hostname` == 'hcl13.ucd.ie' ]; then export ARCH=`uname -r`smp fi </source> * On hcl09 and hcl10, create a directory $HOME/$ARCH. In practice, the directories 2.4.27-2-386 and 2.6.11-1.1369_FC4smp will be created. * Configure all the software on hcl09 and hcl10 with <code>--prefix=$ARCH</code> == Installing Precompiled Software on HCL Fedora Core 4 nodes == If you need a piece of software that is not already installed, you may not need to compile it from source. It is possible to install pre-compiled packages into folders in your home directory<ref>http://www.ajay.ws/2009/7/10/install-a-rpm-in-home-directory-as-non-root</ref>, and this is often easier than compiling from source. The cluster currently runs two different operating systems, Fedora Core 4 and Debian, which use different pre-compiled package formats: RPM for Fedora and .deb for Debian. Unfortunately, relocating the default install directory of a .deb package is troublesome, so this article only pertains to Fedora. Fedora Core 4 was EOL'd in 2006<ref>http://fedoraproject.org/wiki/LifeCycle/EOL</ref> and, four years later, few resources supporting it remain.
You can still find Fedora Core 4 RPMs at the following links (long unsearchable lists; give them time to load): http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/core/4/i386/os/Fedora/i386 http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/extras/4/i386 http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/core/updates/4/i386 What follows is an example of installing the 'gv' (ghostview) package into a folder defined by an environment variable, <code>$APPSDIR</code> (which has a similar purpose to <code>$ARCH</code> in the previous section). #Find an RPM file for your desired package.<br/> #Create the rpm database directory in the local tree (only do this once):<br/><code>mkdir -p $APPSDIR/var/lib/rpm</code> #Initialise the rpm database (also only once):<br/><code>rpm --initdb --root $APPSDIR</code> #Check which files are installed by the RPM file of your package:<br/><code>rpm -qlp gv-3.6.1-4.fc4.i386.rpm</code> #Test for dependencies (the install will not succeed and the command will report an error accessing the root rpm database; ignore this):<br/><code>rpm -ivh gv-3.6.1-4.fc4.i386.rpm</code><br/>If a dependency is reported, the RPM for it must also be found and appended to the <code>rpm -i</code> command (repeat until no further dependencies are required). #Do the actual install into our local folder:<br/><code>rpm --root $APPSDIR --relocate /usr=$APPSDIR --nodeps -ivh gv-3.6.1-4.fc4.i386.rpm</code> 361ddb3821a4632f7ee179fef3a45d859de29025 269 268 2010-04-23T13:23:14Z Davepc 2 /* HCL Cluster 2.0 */ wikitext text/x-wiki http://hcl.ucd.ie/Hardware == HCL Cluster 2.0 == In preparation for a fresh installation of operating systems on the HCL Cluster, the following list of packages is requested.
After the upgrade is complete, this list will become a reference for installed software: * autoconf * automake * fftw2 * git * gfortran * gnuplot * libtool * netperf * octave3.2 * qhull * subversion * valgrind * r-cran-strucchange * libboost-graph-dev * libboost-serialization-dev * openmpi-bin * openmpi-dev * gsl-dev * mc * x-11 * autotools * vim * graphviz * doxygen == Compilation on HCL == * Add to your environment <source lang="bash"> export ARCH=`uname -r` if [ `hostname` == 'hcl13.ucd.ie' ]; then export ARCH=`uname -r`smp fi </source> * On hcl09 and hcl10, create a directory $HOME/$ARCH. In practice, the directories 2.4.27-2-386 and 2.6.11-1.1369_FC4smp will be created. * Configure all the software on hcl09 and hcl10 with <code>--prefix=$ARCH</code> == Installing Precompiled Software on HCL Fedora Core 4 nodes == If you need a piece of software that is not already installed, you may not need to compile it from source. It is possible to install pre-compiled packages into folders in your home directory<ref>http://www.ajay.ws/2009/7/10/install-a-rpm-in-home-directory-as-non-root</ref>, and this is often easier than compiling from source. The cluster currently runs two different operating systems, Fedora Core 4 and Debian, which use different pre-compiled package formats: RPM for Fedora and .deb for Debian. Unfortunately, relocating the default install directory of a .deb package is troublesome, so this article only pertains to Fedora. Fedora Core 4 was EOL'd in 2006<ref>http://fedoraproject.org/wiki/LifeCycle/EOL</ref> and, four years later, few resources supporting it remain.
You can still find Fedora Core 4 RPMs at the following links (long unsearchable lists; give them time to load): http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/core/4/i386/os/Fedora/i386 http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/extras/4/i386 http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/core/updates/4/i386 What follows is an example of installing the 'gv' (ghostview) package into a folder defined by an environment variable, <code>$APPSDIR</code> (which has a similar purpose to <code>$ARCH</code> in the previous section). #Find an RPM file for your desired package.<br/> #Create the rpm database directory in the local tree (only do this once):<br/><code>mkdir -p $APPSDIR/var/lib/rpm</code> #Initialise the rpm database (also only once):<br/><code>rpm --initdb --root $APPSDIR</code> #Check which files are installed by the RPM file of your package:<br/><code>rpm -qlp gv-3.6.1-4.fc4.i386.rpm</code> #Test for dependencies (the install will not succeed and the command will report an error accessing the root rpm database; ignore this):<br/><code>rpm -ivh gv-3.6.1-4.fc4.i386.rpm</code><br/>If a dependency is reported, the RPM for it must also be found and appended to the <code>rpm -i</code> command (repeat until no further dependencies are required). #Do the actual install into our local folder:<br/><code>rpm --root $APPSDIR --relocate /usr=$APPSDIR --nodeps -ivh gv-3.6.1-4.fc4.i386.rpm</code> 0003d96b8fdffe37eed0d46f161ba2e6f6ed3f17 270 269 2010-04-23T13:23:53Z Davepc 2 /* HCL Cluster 2.0 */ wikitext text/x-wiki http://hcl.ucd.ie/Hardware == HCL Cluster 2.0 == In preparation for a fresh installation of operating systems on the HCL Cluster, the following list of packages is requested.
After the upgrade is complete, this list will become a reference for installed software: * autoconf * automake * fftw2 * git * gfortran * gnuplot * libtool * netperf * octave3.2 * qhull * subversion * valgrind * r-cran-strucchange * libboost-graph-dev * libboost-serialization-dev * openmpi-bin * openmpi-dev * gsl-dev * mc * x-11 * autotools * vim * graphviz * doxygen * evince == Compilation on HCL == * Add to your environment <source lang="bash"> export ARCH=`uname -r` if [ `hostname` == 'hcl13.ucd.ie' ]; then export ARCH=`uname -r`smp fi </source> * On hcl09 and hcl10, create a directory $HOME/$ARCH. In practice, the directories 2.4.27-2-386 and 2.6.11-1.1369_FC4smp will be created. * Configure all the software on hcl09 and hcl10 with <code>--prefix=$ARCH</code> == Installing Precompiled Software on HCL Fedora Core 4 nodes == If you need a piece of software that is not already installed, you may not need to compile it from source. It is possible to install pre-compiled packages into folders in your home directory<ref>http://www.ajay.ws/2009/7/10/install-a-rpm-in-home-directory-as-non-root</ref>, and this is often easier than compiling from source. The cluster currently runs two different operating systems, Fedora Core 4 and Debian, which use different pre-compiled package formats: RPM for Fedora and .deb for Debian. Unfortunately, relocating the default install directory of a .deb package is troublesome, so this article only pertains to Fedora. Fedora Core 4 was EOL'd in 2006<ref>http://fedoraproject.org/wiki/LifeCycle/EOL</ref> and, four years later, few resources supporting it remain.
You can still find Fedora Core 4 RPMs at the following links (long unsearchable lists; give them time to load): http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/core/4/i386/os/Fedora/i386 http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/extras/4/i386 http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/core/updates/4/i386 What follows is an example of installing the 'gv' (ghostview) package into a folder defined by an environment variable, <code>$APPSDIR</code> (which has a similar purpose to <code>$ARCH</code> in the previous section). #Find an RPM file for your desired package.<br/> #Create the rpm database directory in the local tree (only do this once):<br/><code>mkdir -p $APPSDIR/var/lib/rpm</code> #Initialise the rpm database (also only once):<br/><code>rpm --initdb --root $APPSDIR</code> #Check which files are installed by the RPM file of your package:<br/><code>rpm -qlp gv-3.6.1-4.fc4.i386.rpm</code> #Test for dependencies (the install will not succeed and the command will report an error accessing the root rpm database; ignore this):<br/><code>rpm -ivh gv-3.6.1-4.fc4.i386.rpm</code><br/>If a dependency is reported, the RPM for it must also be found and appended to the <code>rpm -i</code> command (repeat until no further dependencies are required). #Do the actual install into our local folder:<br/><code>rpm --root $APPSDIR --relocate /usr=$APPSDIR --nodeps -ivh gv-3.6.1-4.fc4.i386.rpm</code> 1bcf4a321632fa18367d434912ed82d26c358468 271 270 2010-04-23T13:25:23Z Davepc 2 /* HCL Cluster 2.0 */ wikitext text/x-wiki http://hcl.ucd.ie/Hardware == HCL Cluster 2.0 == In preparation for a fresh installation of operating systems on the HCL Cluster, the following list of packages is requested.
After the upgrade is complete, this list will become a reference for installed software: * autoconf * automake * fftw2 * git * gfortran * gnuplot * libtool * netperf * octave3.2 * qhull * subversion * valgrind * r-cran-strucchange * libboost-graph-dev * libboost-serialization-dev * openmpi-bin * openmpi-dev * gsl-dev * mc * xorg * autotools * vim * graphviz * doxygen * evince * python == Compilation on HCL == * Add to your environment <source lang="bash"> export ARCH=`uname -r` if [ `hostname` == 'hcl13.ucd.ie' ]; then export ARCH=`uname -r`smp fi </source> * On hcl09 and hcl10, create a directory $HOME/$ARCH. In practice, the directories 2.4.27-2-386 and 2.6.11-1.1369_FC4smp will be created. * Configure all the software on hcl09 and hcl10 with <code>--prefix=$ARCH</code> == Installing Precompiled Software on HCL Fedora Core 4 nodes == If you need a piece of software that is not already installed, you may not need to compile it from source. It is possible to install pre-compiled packages into folders in your home directory<ref>http://www.ajay.ws/2009/7/10/install-a-rpm-in-home-directory-as-non-root</ref>, and this is often easier than compiling from source. The cluster currently runs two different operating systems, Fedora Core 4 and Debian, which use different pre-compiled package formats: RPM for Fedora and .deb for Debian. Unfortunately, relocating the default install directory of a .deb package is troublesome, so this article only pertains to Fedora. Fedora Core 4 was EOL'd in 2006<ref>http://fedoraproject.org/wiki/LifeCycle/EOL</ref> and, four years later, few resources supporting it remain.
You can still find Fedora Core 4 RPMs at the following links (long unsearchable lists; give them time to load): http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/core/4/i386/os/Fedora/i386 http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/extras/4/i386 http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/core/updates/4/i386 What follows is an example of installing the 'gv' (ghostview) package into a folder defined by an environment variable, <code>$APPSDIR</code> (which has a similar purpose to <code>$ARCH</code> in the previous section). #Find an RPM file for your desired package.<br/> #Create the rpm database directory in the local tree (only do this once):<br/><code>mkdir -p $APPSDIR/var/lib/rpm</code> #Initialise the rpm database (also only once):<br/><code>rpm --initdb --root $APPSDIR</code> #Check which files are installed by the RPM file of your package:<br/><code>rpm -qlp gv-3.6.1-4.fc4.i386.rpm</code> #Test for dependencies (the install will not succeed and the command will report an error accessing the root rpm database; ignore this):<br/><code>rpm -ivh gv-3.6.1-4.fc4.i386.rpm</code><br/>If a dependency is reported, the RPM for it must also be found and appended to the <code>rpm -i</code> command (repeat until no further dependencies are required). #Do the actual install into our local folder:<br/><code>rpm --root $APPSDIR --relocate /usr=$APPSDIR --nodeps -ivh gv-3.6.1-4.fc4.i386.rpm</code> 7a744c24c700b1c15495bbcd7a8a837d5b8d8e91 272 271 2010-04-23T13:25:30Z Davepc 2 /* HCL Cluster 2.0 */ wikitext text/x-wiki http://hcl.ucd.ie/Hardware == HCL Cluster 2.0 == In preparation for a fresh installation of operating systems on the HCL Cluster, the following list of packages is requested.
After the upgrade is complete, this list will become a reference for installed software: * autoconf * automake * fftw2 * git * gfortran * gnuplot * libtool * netperf * octave3.2 * qhull * subversion * valgrind * r-cran-strucchange * libboost-graph-dev * libboost-serialization-dev * openmpi-bin * openmpi-dev * gsl-dev * mc * xorg * autotools * vim * graphviz * doxygen * evince * python == Compilation on HCL == * Add to your environment <source lang="bash"> export ARCH=`uname -r` if [ `hostname` == 'hcl13.ucd.ie' ]; then export ARCH=`uname -r`smp fi </source> * On hcl09 and hcl10, create a directory $HOME/$ARCH. In practice, the directories 2.4.27-2-386 and 2.6.11-1.1369_FC4smp will be created. * Configure all the software on hcl09 and hcl10 with <code>--prefix=$ARCH</code> == Installing Precompiled Software on HCL Fedora Core 4 nodes == If you need a piece of software that is not already installed, you may not need to compile it from source. It is possible to install pre-compiled packages into folders in your home directory<ref>http://www.ajay.ws/2009/7/10/install-a-rpm-in-home-directory-as-non-root</ref>, and this is often easier than compiling from source. The cluster currently runs two different operating systems, Fedora Core 4 and Debian, which use different pre-compiled package formats: RPM for Fedora and .deb for Debian. Unfortunately, relocating the default install directory of a .deb package is troublesome, so this article only pertains to Fedora. Fedora Core 4 was EOL'd in 2006<ref>http://fedoraproject.org/wiki/LifeCycle/EOL</ref> and, four years later, few resources supporting it remain.
You can still find Fedora Core 4 RPMs at the following links (long unsearchable lists; give them time to load): http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/core/4/i386/os/Fedora/i386 http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/extras/4/i386 http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/core/updates/4/i386 What follows is an example of installing the 'gv' (ghostview) package into a folder defined by an environment variable, <code>$APPSDIR</code> (which has a similar purpose to <code>$ARCH</code> in the previous section). #Find an RPM file for your desired package.<br/> #Create the rpm database directory in the local tree (only do this once):<br/><code>mkdir -p $APPSDIR/var/lib/rpm</code> #Initialise the rpm database (also only once):<br/><code>rpm --initdb --root $APPSDIR</code> #Check which files are installed by the RPM file of your package:<br/><code>rpm -qlp gv-3.6.1-4.fc4.i386.rpm</code> #Test for dependencies (the install will not succeed and the command will report an error accessing the root rpm database; ignore this):<br/><code>rpm -ivh gv-3.6.1-4.fc4.i386.rpm</code><br/>If a dependency is reported, the RPM for it must also be found and appended to the <code>rpm -i</code> command (repeat until no further dependencies are required). #Do the actual install into our local folder:<br/><code>rpm --root $APPSDIR --relocate /usr=$APPSDIR --nodeps -ivh gv-3.6.1-4.fc4.i386.rpm</code> 95631f0030ff0fff53c7cebd0debdb3f03eb328a 273 272 2010-04-23T13:44:21Z Davepc 2 /* HCL Cluster 2.0 */ wikitext text/x-wiki http://hcl.ucd.ie/Hardware == HCL Cluster 2.0 == In preparation for a fresh installation of operating systems on the HCL Cluster, the following list of packages is requested.
After the upgrade is complete, this list will become a reference for installed software: Done * autoconf * automake * fftw2 * git * gfortran * gnuplot * libtool * netperf * octave3.2 * qhull * subversion * valgrind * gsl-dev * vim * python * mc * openmpi-bin * openmpi-dev * evince * libboost-graph-dev * libboost-serialization-dev * r-cran-strucchange * graphviz * doxygen == Compilation on HCL == * Add to your environment <source lang="bash"> export ARCH=`uname -r` if [ `hostname` == 'hcl13.ucd.ie' ]; then export ARCH=`uname -r`smp fi </source> * On hcl09 and hcl10, create a directory $HOME/$ARCH. In practice, the directories 2.4.27-2-386 and 2.6.11-1.1369_FC4smp will be created. * Configure all the software on hcl09 and hcl10 with <code>--prefix=$ARCH</code> == Installing Precompiled Software on HCL Fedora Core 4 nodes == If you need a piece of software that is not already installed, you may not need to compile it from source. It is possible to install pre-compiled packages into folders in your home directory<ref>http://www.ajay.ws/2009/7/10/install-a-rpm-in-home-directory-as-non-root</ref>, and this is often easier than compiling from source. The cluster currently runs two different operating systems, Fedora Core 4 and Debian, which use different pre-compiled package formats: RPM for Fedora and .deb for Debian. Unfortunately, relocating the default install directory of a .deb package is troublesome, so this article only pertains to Fedora. Fedora Core 4 was EOL'd in 2006<ref>http://fedoraproject.org/wiki/LifeCycle/EOL</ref> and, four years later, few resources supporting it remain.
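The <code>$ARCH</code> scheme from the compilation section can be sketched end to end. Here the install prefix is taken to be <code>$HOME/$ARCH</code> (an assumption: the section passes <code>--prefix=$ARCH</code>, which only gives a per-kernel tree under the home directory if it expands to that absolute path):

```shell
# Derive the per-kernel architecture string, as in the wiki snippet.
ARCH=$(uname -r)
if [ "$(hostname)" = "hcl13.ucd.ie" ]; then
    ARCH="${ARCH}smp"
fi

# Per-architecture install tree in the home directory
# (e.g. $HOME/2.4.27-2-386 or $HOME/2.6.11-1.1369_FC4smp).
PREFIX="$HOME/$ARCH"
mkdir -p "$PREFIX"

# Each package would then be built with a configure line like this.
echo "./configure --prefix=$PREFIX && make && make install"
```

With this layout, the same home directory can hold separately compiled binaries for every kernel flavour on the cluster.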
You can still find Fedora Core 4 RPMs at the following links (long unsearchable lists; give them time to load): http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/core/4/i386/os/Fedora/i386 http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/extras/4/i386 http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/core/updates/4/i386 What follows is an example of installing the 'gv' (ghostview) package into a folder defined by an environment variable, <code>$APPSDIR</code> (which has a similar purpose to <code>$ARCH</code> in the previous section). #Find an RPM file for your desired package.<br/> #Create the rpm database directory in the local tree (only do this once):<br/><code>mkdir -p $APPSDIR/var/lib/rpm</code> #Initialise the rpm database (also only once):<br/><code>rpm --initdb --root $APPSDIR</code> #Check which files are installed by the RPM file of your package:<br/><code>rpm -qlp gv-3.6.1-4.fc4.i386.rpm</code> #Test for dependencies (the install will not succeed and the command will report an error accessing the root rpm database; ignore this):<br/><code>rpm -ivh gv-3.6.1-4.fc4.i386.rpm</code><br/>If a dependency is reported, the RPM for it must also be found and appended to the <code>rpm -i</code> command (repeat until no further dependencies are required). #Do the actual install into our local folder:<br/><code>rpm --root $APPSDIR --relocate /usr=$APPSDIR --nodeps -ivh gv-3.6.1-4.fc4.i386.rpm</code> b47818c8ed203ea91e8c2435c6ea04fdd5126468 274 273 2010-04-23T16:57:53Z Rhiggins 4 /* HCL Cluster 2.0 */ wikitext text/x-wiki http://hcl.ucd.ie/Hardware == HCL Cluster 2.0 == In preparation for a fresh installation of operating systems on the HCL Cluster, the following list of packages is requested.
After the upgrade is complete, this list will become a reference for installed software: Done * autoconf * automake * fftw2 * git * gfortran * gnuplot * libtool * netperf * octave3.2 * qhull * subversion * valgrind * gsl-dev * vim * python * mc * openmpi-bin * openmpi-dev * evince * libboost-graph-dev * libboost-serialization-dev * r-cran-strucchange * graphviz * doxygen Install Log of new heterogeneous.ucd.ie: [install log] == Compilation on HCL == * Add to your environment <source lang="bash"> export ARCH=`uname -r` if [ `hostname` == 'hcl13.ucd.ie' ]; then export ARCH=`uname -r`smp fi </source> * On hcl09 and hcl10, create a directory $HOME/$ARCH. In practice, the directories 2.4.27-2-386 and 2.6.11-1.1369_FC4smp will be created. * Configure all the software on hcl09 and hcl10 with <code>--prefix=$ARCH</code> == Installing Precompiled Software on HCL Fedora Core 4 nodes == If you need a piece of software that is not already installed, you may not need to compile it from source. It is possible to install pre-compiled packages into folders in your home directory<ref>http://www.ajay.ws/2009/7/10/install-a-rpm-in-home-directory-as-non-root</ref>, and this is often easier than compiling from source. The cluster currently runs two different operating systems, Fedora Core 4 and Debian, which use different pre-compiled package formats: RPM for Fedora and .deb for Debian. Unfortunately, relocating the default install directory of a .deb package is troublesome, so this article only pertains to Fedora. Fedora Core 4 was EOL'd in 2006<ref>http://fedoraproject.org/wiki/LifeCycle/EOL</ref> and, four years later, few resources supporting it remain.
You can still find Fedora Core 4 RPMs at the following links (long unsearchable lists; give them time to load): http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/core/4/i386/os/Fedora/i386 http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/extras/4/i386 http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/core/updates/4/i386 What follows is an example of installing the 'gv' (ghostview) package into a folder defined by an environment variable, <code>$APPSDIR</code> (which has a similar purpose to <code>$ARCH</code> in the previous section). #Find an RPM file for your desired package.<br/> #Create the rpm database directory in the local tree (only do this once):<br/><code>mkdir -p $APPSDIR/var/lib/rpm</code> #Initialise the rpm database (also only once):<br/><code>rpm --initdb --root $APPSDIR</code> #Check which files are installed by the RPM file of your package:<br/><code>rpm -qlp gv-3.6.1-4.fc4.i386.rpm</code> #Test for dependencies (the install will not succeed and the command will report an error accessing the root rpm database; ignore this):<br/><code>rpm -ivh gv-3.6.1-4.fc4.i386.rpm</code><br/>If a dependency is reported, the RPM for it must also be found and appended to the <code>rpm -i</code> command (repeat until no further dependencies are required). #Do the actual install into our local folder:<br/><code>rpm --root $APPSDIR --relocate /usr=$APPSDIR --nodeps -ivh gv-3.6.1-4.fc4.i386.rpm</code> 190231f7a5218e7e0696e0eeaf64927f4ac17581 275 274 2010-04-23T16:58:12Z Rhiggins 4 Undo revision 274 by [[Special:Contributions/Rhiggins|Rhiggins]] ([[User talk:Rhiggins|Talk]]) wikitext text/x-wiki http://hcl.ucd.ie/Hardware == HCL Cluster 2.0 == In preparation for a fresh installation of operating systems on the HCL Cluster, the following list of packages is requested.
After the upgrade is complete, this list will become a reference for installed software: Done * autoconf * automake * fftw2 * git * gfortran * gnuplot * libtool * netperf * octave3.2 * qhull * subversion * valgrind * gsl-dev * vim * python * mc * openmpi-bin * openmpi-dev * evince * libboost-graph-dev * libboost-serialization-dev * r-cran-strucchange * graphviz * doxygen == Compilation on HCL == * Add to your environment <source lang="bash"> export ARCH=`uname -r` if [ `hostname` == 'hcl13.ucd.ie' ]; then export ARCH=`uname -r`smp fi </source> * On hcl09 and hcl10, create a directory $HOME/$ARCH. In practice, the directories 2.4.27-2-386 and 2.6.11-1.1369_FC4smp will be created. * Configure all the software on hcl09 and hcl10 with <code>--prefix=$ARCH</code> == Installing Precompiled Software on HCL Fedora Core 4 nodes == If you need a piece of software that is not already installed, you may not need to compile it from source. It is possible to install pre-compiled packages into folders in your home directory<ref>http://www.ajay.ws/2009/7/10/install-a-rpm-in-home-directory-as-non-root</ref>, and this is often easier than compiling from source. The cluster currently runs two different operating systems, Fedora Core 4 and Debian, which use different pre-compiled package formats: RPM for Fedora and .deb for Debian. Unfortunately, relocating the default install directory of a .deb package is troublesome, so this article only pertains to Fedora. Fedora Core 4 was EOL'd in 2006<ref>http://fedoraproject.org/wiki/LifeCycle/EOL</ref> and, four years later, few resources supporting it remain.
You can still find Fedora 4 RPMs at the following links (long unsearchable lists, give them time to load): http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/core/4/i386/os/Fedora/i386 http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/extras/4/i386 http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/core/updates/4/i386 What follows is an example of installing the 'gv' (ghostview) package into a folder defined by an environment variable: <code>$APPSDIR</code> (which has a similar to purpose to <code>$ARCH</code> in the previous section). #find an RPM file for your desired package<br/> #create rpm database directory in local tree (only do this once):<br/><code>mkdir -p $APPSDIR/var/lib/rpm</code> #initialise rpm database (do this only once also):<br/><code>rpm --initdb --root $APPSDIR</code> #check what files are installed by the RPM file of your package<br/><code>rpm -qlp gv-3.6.1-4.fc4.i386.rpm</code> #test for dependencies (install will not be successful and command will report an error accessing the root rpm database, ignore this):<br/><code>rpm -ivh gv-3.6.1-4.fc4.i386.rpm</code><br/>if a dependency is found, the RPM for it must also be found and appended to the <code>rpm -i</code> command (until no further dependencies are required). #do actual install to our local folder:<br/><code>rpm --root $APPSDIR --relocate /usr=$APPSDIR --nodeps -ivh gv-3.6.1-4.fc4.i386.rpm</code> b47818c8ed203ea91e8c2435c6ea04fdd5126468 276 275 2010-04-23T16:59:20Z Rhiggins 4 /* HCL Cluster 2.0 */ wikitext text/x-wiki http://hcl.ucd.ie/Hardware == HCL Cluster 2.0 == In preperation for a fresh installation of operating systems on HCL Cluster the follow list of packages are requested. 
After upgrade is complete, this list will become a reference for installed software: Done * autoconf * automake * fftw2 * git * gfortran * gnuplot * libtool * netperf * octave3.2 * qhull * subversion * valgrind * gsl-dev * vim * python * mc * openmpi-bin * openmpi-dev * evince * libboost-graph-dev * libboost-serialization-dev * r-cran-strucchange * graphviz * doxygen [[new heterogeneous.ucd.ie install log]] == Compilation on HCL == * Add to your environment <source lang="bash"> export ARCH=`uname -r` if [ `hostname` == 'hcl13.ucd.ie' ]; then export ARCH=`uname -r`smp fi </source> * On hcl09 and hcl10, create a directory $HOME/$ARCH. Actually, 2.4.27-2-386 and 2.6.11-1.1369_FC4smp directories will be created. * Configure all the software on hcl09 and hcl10, with <code>--prefix=$ARCH</code> == Installing Precompiled Software on HCL Fedora Core 4 nodes== If you need a piece of software that is not already installed you may not need to compile it from source. It is possible to install pre-compiled packages to folders in your home directory<ref>http://www.ajay.ws/2009/7/10/install-a-rpm-in-home-directory-as-non-root</ref>, and this is often easier than compiling from source. The cluster uses two different OSes at present, Fedora Core 4 and Debian. These both have different pre-compiled package formats, RPM for Fedora and .deb for Debian. Unfortunately relocating the default install directory of a .deb package is troublesome, so this article only pertains to Fedora. Fedora Core 4 is was EOL'd in 2006<ref>http://fedoraproject.org/wiki/LifeCycle/EOL</ref>and 4 years later there are few remaining resources supporting it. 
You can still find Fedora 4 RPMs at the following links (long unsearchable lists, give them time to load): http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/core/4/i386/os/Fedora/i386 http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/extras/4/i386 http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/core/updates/4/i386 What follows is an example of installing the 'gv' (ghostview) package into a folder defined by an environment variable: <code>$APPSDIR</code> (which has a similar to purpose to <code>$ARCH</code> in the previous section). #find an RPM file for your desired package<br/> #create rpm database directory in local tree (only do this once):<br/><code>mkdir -p $APPSDIR/var/lib/rpm</code> #initialise rpm database (do this only once also):<br/><code>rpm --initdb --root $APPSDIR</code> #check what files are installed by the RPM file of your package<br/><code>rpm -qlp gv-3.6.1-4.fc4.i386.rpm</code> #test for dependencies (install will not be successful and command will report an error accessing the root rpm database, ignore this):<br/><code>rpm -ivh gv-3.6.1-4.fc4.i386.rpm</code><br/>if a dependency is found, the RPM for it must also be found and appended to the <code>rpm -i</code> command (until no further dependencies are required). #do actual install to our local folder:<br/><code>rpm --root $APPSDIR --relocate /usr=$APPSDIR --nodeps -ivh gv-3.6.1-4.fc4.i386.rpm</code> b3de4a31b3df3d3c76bd3b90caca486486350b45 281 276 2010-04-24T13:51:29Z Rhiggins 4 /* HCL Cluster 2.0 */ wikitext text/x-wiki http://hcl.ucd.ie/Hardware == HCL Cluster 2.0 == In preperation for a fresh installation of operating systems on HCL Cluster the follow list of packages are requested. 
After upgrade is complete, this list will become a reference for installed software: Done * autoconf * automake * fftw2 * git * gfortran * gnuplot * libtool * netperf * octave3.2 * qhull * subversion * valgrind * gsl-dev * vim * python * mc * openmpi-bin * openmpi-dev * evince * libboost-graph-dev * libboost-serialization-dev * r-cran-strucchange * graphviz * doxygen [[new hcl node install & configuration log]] [[new heterogeneous.ucd.ie install log]] == Compilation on HCL == * Add to your environment <source lang="bash"> export ARCH=`uname -r` if [ `hostname` == 'hcl13.ucd.ie' ]; then export ARCH=`uname -r`smp fi </source> * On hcl09 and hcl10, create a directory $HOME/$ARCH. Actually, 2.4.27-2-386 and 2.6.11-1.1369_FC4smp directories will be created. * Configure all the software on hcl09 and hcl10, with <code>--prefix=$ARCH</code> == Installing Precompiled Software on HCL Fedora Core 4 nodes== If you need a piece of software that is not already installed you may not need to compile it from source. It is possible to install pre-compiled packages to folders in your home directory<ref>http://www.ajay.ws/2009/7/10/install-a-rpm-in-home-directory-as-non-root</ref>, and this is often easier than compiling from source. The cluster uses two different OSes at present, Fedora Core 4 and Debian. These both have different pre-compiled package formats, RPM for Fedora and .deb for Debian. Unfortunately relocating the default install directory of a .deb package is troublesome, so this article only pertains to Fedora. Fedora Core 4 is was EOL'd in 2006<ref>http://fedoraproject.org/wiki/LifeCycle/EOL</ref>and 4 years later there are few remaining resources supporting it. 
You can still find Fedora 4 RPMs at the following links (long unsearchable lists, give them time to load): http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/core/4/i386/os/Fedora/i386 http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/extras/4/i386 http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/core/updates/4/i386 What follows is an example of installing the 'gv' (ghostview) package into a folder defined by an environment variable: <code>$APPSDIR</code> (which has a similar to purpose to <code>$ARCH</code> in the previous section). #find an RPM file for your desired package<br/> #create rpm database directory in local tree (only do this once):<br/><code>mkdir -p $APPSDIR/var/lib/rpm</code> #initialise rpm database (do this only once also):<br/><code>rpm --initdb --root $APPSDIR</code> #check what files are installed by the RPM file of your package<br/><code>rpm -qlp gv-3.6.1-4.fc4.i386.rpm</code> #test for dependencies (install will not be successful and command will report an error accessing the root rpm database, ignore this):<br/><code>rpm -ivh gv-3.6.1-4.fc4.i386.rpm</code><br/>if a dependency is found, the RPM for it must also be found and appended to the <code>rpm -i</code> command (until no further dependencies are required). #do actual install to our local folder:<br/><code>rpm --root $APPSDIR --relocate /usr=$APPSDIR --nodeps -ivh gv-3.6.1-4.fc4.i386.rpm</code> a26fb9cdc8808f49a6e3a31574922f5dea3fcc0b 282 281 2010-04-24T13:51:39Z Rhiggins 4 /* HCL Cluster 2.0 */ wikitext text/x-wiki http://hcl.ucd.ie/Hardware == HCL Cluster 2.0 == In preperation for a fresh installation of operating systems on HCL Cluster the follow list of packages are requested. 
After upgrade is complete, this list will become a reference for installed software: Done * autoconf * automake * fftw2 * git * gfortran * gnuplot * libtool * netperf * octave3.2 * qhull * subversion * valgrind * gsl-dev * vim * python * mc * openmpi-bin * openmpi-dev * evince * libboost-graph-dev * libboost-serialization-dev * r-cran-strucchange * graphviz * doxygen [[new hcl node install & configuration log]] [[new heterogeneous.ucd.ie install log]] == Compilation on HCL == * Add to your environment <source lang="bash"> export ARCH=`uname -r` if [ `hostname` == 'hcl13.ucd.ie' ]; then export ARCH=`uname -r`smp fi </source> * On hcl09 and hcl10, create a directory $HOME/$ARCH. Actually, 2.4.27-2-386 and 2.6.11-1.1369_FC4smp directories will be created. * Configure all the software on hcl09 and hcl10, with <code>--prefix=$ARCH</code> == Installing Precompiled Software on HCL Fedora Core 4 nodes== If you need a piece of software that is not already installed you may not need to compile it from source. It is possible to install pre-compiled packages to folders in your home directory<ref>http://www.ajay.ws/2009/7/10/install-a-rpm-in-home-directory-as-non-root</ref>, and this is often easier than compiling from source. The cluster uses two different OSes at present, Fedora Core 4 and Debian. These both have different pre-compiled package formats, RPM for Fedora and .deb for Debian. Unfortunately relocating the default install directory of a .deb package is troublesome, so this article only pertains to Fedora. Fedora Core 4 is was EOL'd in 2006<ref>http://fedoraproject.org/wiki/LifeCycle/EOL</ref>and 4 years later there are few remaining resources supporting it. 
You can still find Fedora 4 RPMs at the following links (long unsearchable lists, give them time to load): http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/core/4/i386/os/Fedora/i386 http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/extras/4/i386 http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/core/updates/4/i386 What follows is an example of installing the 'gv' (ghostview) package into a folder defined by an environment variable: <code>$APPSDIR</code> (which has a similar to purpose to <code>$ARCH</code> in the previous section). #find an RPM file for your desired package<br/> #create rpm database directory in local tree (only do this once):<br/><code>mkdir -p $APPSDIR/var/lib/rpm</code> #initialise rpm database (do this only once also):<br/><code>rpm --initdb --root $APPSDIR</code> #check what files are installed by the RPM file of your package<br/><code>rpm -qlp gv-3.6.1-4.fc4.i386.rpm</code> #test for dependencies (install will not be successful and command will report an error accessing the root rpm database, ignore this):<br/><code>rpm -ivh gv-3.6.1-4.fc4.i386.rpm</code><br/>if a dependency is found, the RPM for it must also be found and appended to the <code>rpm -i</code> command (until no further dependencies are required). #do actual install to our local folder:<br/><code>rpm --root $APPSDIR --relocate /usr=$APPSDIR --nodeps -ivh gv-3.6.1-4.fc4.i386.rpm</code> be1145b4a497c60916072cdaccf15332f57f01a9 HCL cluster/heterogeneous.ucd.ie install log 0 48 277 2010-04-23T19:52:30Z Rhiggins 4 New page: * Basic installation of Debian Sarge * edit /etc/networks/interfaces <source lang="text"> iface eth0 inet static address 192.168.21.254 netmask 255.255.255.0 gateway 192.168.21.1 </sou... 
wikitext text/x-wiki * Basic installation of Debian Sarge * edit /etc/network/interfaces <source lang="text"> iface eth0 inet static address 192.168.21.254 netmask 255.255.255.0 gateway 192.168.21.1 </source> * set resolv.conf: <source lang="text"> nameserver 137.43.116.19 nameserver 137.43.116.17 nameserver 137.43.105.22 domain ucd.ie search ucd.ie </source> * follow the guide to installing Clonezilla [http://www.howtoforge.com/cloning-linux-systems-with-clonezilla-server-edition-clonezilla-se-p2 here]. Essentially: ** add repository key <source lang="text">wget -q http://drbl.sourceforge.net/GPG-KEY-DRBL -O- | apt-key add -</source> ** add the following line to /etc/apt/sources.list: <source lang="text">deb http://drbl.sourceforge.net/drbl-core drbl stable</source> ** run: <source lang="text">apt-get update apt-get install drbl /opt/drbl/sbin/drbl4imp</source> ** accept default options to drbl4imp and then edit /etc/dhcpd3/dhcpd.conf c231ea3e4bd35e3a06401dcbfc8d00c800d377fd 278 277 2010-04-24T12:30:31Z Rhiggins 4 wikitext text/x-wiki * Basic installation of Debian Sarge * edit /etc/network/interfaces <source lang="text"> iface eth0 inet static address 192.168.21.254 netmask 255.255.255.0 gateway 192.168.21.1 </source> * set resolv.conf: <source lang="text"> nameserver 137.43.116.19 nameserver 137.43.116.17 nameserver 137.43.105.22 domain ucd.ie search ucd.ie </source> * follow the guide to installing Clonezilla [http://www.howtoforge.com/cloning-linux-systems-with-clonezilla-server-edition-clonezilla-se-p2 here].
Essentially: ** add repository key <source lang="text">wget -q http://drbl.sourceforge.net/GPG-KEY-DRBL -O- | apt-key add -</source> ** add the following line to /etc/apt/sources.list: <source lang="text">deb http://drbl.sourceforge.net/drbl-core drbl stable</source> ** run: <source lang="text">apt-get update apt-get install drbl /opt/drbl/sbin/drbl4imp</source> ** accept default options to drbl4imp and then edit /etc/dhcpd3/dhcpd.conf fc125eff5b34f157130e00dc5e0c5f5050f5574f 279 278 2010-04-24T12:42:21Z Rhiggins 4 wikitext text/x-wiki * Basic installation of Debian Sarge * edit /etc/network/interfaces <source lang="text"> iface eth0 inet static address 192.168.21.254 netmask 255.255.255.0 gateway 192.168.21.1 </source> * set resolv.conf: <source lang="text"> nameserver 137.43.116.19 nameserver 137.43.116.17 nameserver 137.43.105.22 domain ucd.ie search ucd.ie </source> * follow the guide to installing Clonezilla [http://www.howtoforge.com/cloning-linux-systems-with-clonezilla-server-edition-clonezilla-se-p2 here]. Essentially: ** add repository key <source lang="text">wget -q http://drbl.sourceforge.net/GPG-KEY-DRBL -O- | apt-key add -</source> ** add the following line to /etc/apt/sources.list: <source lang="text">deb http://drbl.sourceforge.net/drbl-core drbl stable</source> ** run: <source lang="text">apt-get update && apt-get install drbl && /opt/drbl/sbin/drbl4imp</source> ** accept default options to drbl4imp. * After Clonezilla has installed, edit <code>/etc/dhcpd3/dhcpd.conf</code>, adding all entries for test nodes <code>hcl07</code> and <code>hcl03</code>. Also ensure these nodes have been removed from the existing heterogeneous.ucd.ie server so that they are only served by one machine. <source lang="text"> default-lease-time 300; max-lease-time 300; option subnet-mask 255.255.255.0; option domain-name-servers 137.43.116.19,137.43.116.17,137.43.105.22; option domain-name "ucd.ie"; ddns-update-style none; # brett had ad-hoc ...?
server-name drbl; filename = "pxelinux.0"; subnet 192.168.21.0 netmask 255.255.255.0 { option subnet-mask 255.255.255.0; option routers 192.168.21.1; next-server 192.168.21.254; pool { # allow members of "DRBL-Client"; range 192.168.21.200 192.168.21.212; } host hcl03 { option host-name "hcl03.ucd.ie"; hardware ethernet 00:14:22:0A:22:6C; fixed-address 192.168.21.5; } host hcl03_eth1 { option host-name "hcl03_eth1.ucd.ie"; hardware ethernet 00:14:22:0A:22:6D; fixed-address 192.168.21.105; } host hcl07 { option host-name "hcl07.ucd.ie"; hardware ethernet 00:14:22:0A:20:E2; fixed-address 192.168.21.9; } host hcl07_eth1 { option host-name "hcl07_eth1.ucd.ie"; hardware ethernet 00:14:22:0A:20:E3; fixed-address 192.168.21.109; } default-lease-time 21600; max-lease-time 43200; } </source> 70abb84cd627bb7758bd67d12a95f6ce39334960 280 279 2010-04-24T13:25:11Z Rhiggins 4 wikitext text/x-wiki * Basic installation of Debian Squeeze * edit /etc/networks/interfaces <source lang="text"> iface eth0 inet static address 192.168.21.254 netmask 255.255.255.0 gateway 192.168.21.1 </source> * set resolv.conf: <source lang="text"> nameserver 137.43.116.19 nameserver 137.43.116.17 nameserver 137.43.105.22 domain ucd.ie search ucd.ie </source> * Install non-free linux firmware ** Edit <code>/etc/apt/sources.list</code> including the lines: <source lang="text">deb http://ftp.ie.debian.org/debian/ squeeze main contrib non-free deb-src http://ftp.ie.debian.org/debian/ squeeze main contrib non-free</source> :* Install firmware-linux: <source lang="text">apt-get update && apt-get install firmware-linux</source> * follow the guide to installing Clonezilla [http://www.howtoforge.com/cloning-linux-systems-with-clonezilla-server-edition-clonezilla-se-p2 here]. 
Essentially: ** add repository key <source lang="text">wget -q http://drbl.sourceforge.net/GPG-KEY-DRBL -O- | apt-key add -</source> ** the line add to /etc/apt/sources.list: <source lang="text">deb http://drbl.sourceforge.net/drbl-core drbl stable</source> ** run: <source lang="text">apt-get update && apt-get install drbl && /opt/drbl/sbin/drbl4imp</source> ** accept default options to drbl4imp. * After Clonezilla has installed edit <code>/etc/dhcpd3/dhcpd.conf</code>, adding all entries for test nodes <code>hcl07</code> and <code>hcl03</code>. Also ensure these nodes have been removed from the inplace heterogeneous.ucd.ie server so that they are only served by one machine. <source lang="text"> default-lease-time 300; max-lease-time 300; option subnet-mask 255.255.255.0; option domain-name-servers 137.43.116.19,137.43.116.17,137.43.105.22; option domain-name "ucd.ie"; ddns-update-style none; # brett had ad-hoc ...? server-name drbl; filename = "pxelinux.0"; subnet 192.168.21.0 netmask 255.255.255.0 { option subnet-mask 255.255.255.0; option routers 192.168.21.1; next-server 192.168.21.254; pool { # allow members of "DRBL-Client"; range 192.168.21.200 192.168.21.212; } host hcl03 { option host-name "hcl03.ucd.ie"; hardware ethernet 00:14:22:0A:22:6C; fixed-address 192.168.21.5; } host hcl03_eth1 { option host-name "hcl03_eth1.ucd.ie"; hardware ethernet 00:14:22:0A:22:6D; fixed-address 192.168.21.105; } host hcl07 { option host-name "hcl07.ucd.ie"; hardware ethernet 00:14:22:0A:20:E2; fixed-address 192.168.21.9; } host hcl07_eth1 { option host-name "hcl07_eth1.ucd.ie"; hardware ethernet 00:14:22:0A:20:E3; fixed-address 192.168.21.109; } default-lease-time 21600; max-lease-time 43200; } </source> 8c80f06c3e86db36c9bb3bb4d7d399f818da789b HCL cluster/hcl node install configuration log 0 49 283 2010-04-24T14:21:25Z Rhiggins 4 New page: HCL Nodes will be installed from a clone of a root node, <code>hcl07</code>. The general installation of the root is documented here. 
There are a number of complications as a result of the... wikitext text/x-wiki HCL Nodes will be installed from a clone of a root node, <code>hcl07</code>. The general installation of the root is documented here. There are a number of complications as a result of the cloning process. Solutions to these complications are also explained. =General Installation= Partition the filesystem with a 1 GB swap partition at the end of the disk, equal to the maximum memory installed on the cluster nodes. The root file system occupies the remainder of the disk, EXT4 format. Install the long list of packages. Configure the network interfaces as follows: <source lang="text"># This file describes the network interfaces available on your system # and how to activate them. For more information, see interfaces(5). # The loopback network interface auto lo eth1 eth0 iface lo inet loopback # The primary network interface allow-hotplug eth0 iface eth0 inet dhcp allow-hotplug eth1 iface eth1 inet dhcp </source> =Complications= ==Hostnames== Debian does not pull the hostname from the DHCP server. Without intervention, cloned nodes will keep the hostname of the root node. A bug describing this issue can be found [https://bugs.launchpad.net/ubuntu/+source/dhcp3/+bug/90388 here]. The solution we will use is to add the file <code>/etc/dhcp3/dhclient-exit-hooks.d/hostname</code> with the contents: <source lang="bash"> if [[ -n $new_host_name ]]; then echo "$new_host_name" > /etc/hostname /bin/hostname $new_host_name fi </source> The effect of this is to set the hostname of the machine after an interface is configured using dhclient (DHCP Client). Note, the hostname of the machine will be set by the last interface that is configured by DHCP, which in the current configuration will be <code>eth0</code>. If interfaces are reconfigured using dhclient, the hostnames will change. Further, the current hostnames for the second interface on nodes <code>eth1</code> are '''invalid'''.
They follow the format hcl??_eth1.ucd.ie, however the '_' character is not permitted in hostnames and attempting to set such a hostname fails. ==udev and Interfaces== 9268c8da8c345c33fc923695b6dba6dea5b57d2b HCL cluster/hcl node install configuration log 0 49 284 283 2010-04-24T14:31:35Z Rhiggins 4 /* Hostnames */ wikitext text/x-wiki HCL Nodes will be installed from a clone of a root node, <code>hcl07</code>. The general installation of the root is documented here. There are a number of complications as a result of the cloning process. Solutions to these complications are also explained. =General Installation= Partition filesystem with swap at the end of the disk, size 1GB, equal to maximum of the installed memory on cluster nodes. Root file system occupies the remainder of the disk, EXT4 format. Install long list of packages. Configure network interface as follows: <source lang="text"># This file describes the network interfaces available on your system # and how to activate them. For more information, see interfaces(5). # The loopback network interface auto lo eth1 eth0 iface lo inet loopback # The primary network interface allow-hotplug eth0 iface eth0 inet dhcp allow-hotplug eth1 iface eth1 inet dhcp </source> =Complications= ==Hostnames== Debian does not pull the hostname from the DHCP Server. Without intervention cloned nodes will keep the hostname stored on the image of the root node. A bug describing the setting of a hostname via DHCP is described [https://bugs.launchpad.net/ubuntu/+source/dhcp3/+bug/90388 here] The solution we will use is to add the file <code>/etc/dhcp3/dhclient-exit-hooks.d/hostname</code> with the contents: <source lang="bash"> if [[ -n $new_host_name ]]; then echo "$new_host_name" > /etc/hostname /bin/hostname $new_host_name fi </source> The effect of this is to set the hostname of the machine after an interface is configured using dhclient (DHCP Client). 
Note, the hostname of the machine will be set by the last interface that is configured via DHCP, in the current configuration that will be <code>eth0</code>. If an interface is reconfigured using dhclient, the hostname will be reset to the name belonging to that interface. Further, the current hostnames for the second interface on nodes <code>eth1</code> are '''invalid'''. They follow the format hcl??_eth1.ucd.ie, however the '_' character is not permitted in hostnames and attempting to set such a hostname fails. ==udev and Interfaces== 39da215c71ca8e9f8343efb054d883043c382f8d 285 284 2010-04-24T14:57:21Z Rhiggins 4 /* udev and Interfaces */ wikitext text/x-wiki HCL Nodes will be installed from a clone of a root node, <code>hcl07</code>. The general installation of the root is documented here. There are a number of complications as a result of the cloning process. Solutions to these complications are also explained. =General Installation= Partition filesystem with swap at the end of the disk, size 1GB, equal to maximum of the installed memory on cluster nodes. Root file system occupies the remainder of the disk, EXT4 format. Install long list of packages. Configure network interface as follows: <source lang="text"># This file describes the network interfaces available on your system # and how to activate them. For more information, see interfaces(5). # The loopback network interface auto lo eth1 eth0 iface lo inet loopback # The primary network interface allow-hotplug eth0 iface eth0 inet dhcp allow-hotplug eth1 iface eth1 inet dhcp </source> =Complications= ==Hostnames== Debian does not pull the hostname from the DHCP Server. Without intervention cloned nodes will keep the hostname stored on the image of the root node. 
A bug report describing the setting of a hostname via DHCP can be found [https://bugs.launchpad.net/ubuntu/+source/dhcp3/+bug/90388 here]. The solution we will use is to add the file <code>/etc/dhcp3/dhclient-exit-hooks.d/hostname</code> with the contents: <source lang="bash"> if [[ -n $new_host_name ]]; then echo "$new_host_name" > /etc/hostname /bin/hostname $new_host_name fi </source> The effect of this is to set the hostname of the machine after an interface is configured using dhclient (DHCP Client). Note, the hostname of the machine will be set by the last interface that is configured via DHCP, which in the current configuration will be <code>eth0</code>. If an interface is reconfigured using dhclient, the hostname will be reset to the name belonging to that interface. Further, the current hostnames for the second interface on nodes <code>eth1</code> are '''invalid'''. They follow the format hcl??_eth1.ucd.ie, however, the '_' character is not permitted in hostnames and attempting to set such a hostname fails. ==udev and Network Interfaces== The udev system attempts to keep network interface names consistent regardless of changing hardware. This may be useful for laptops with wireless cards that are plugged in and out, but it causes problems when trying to install our root node image across all machines in the cluster. A description of the problem can be read [http://www.ducea.com/2008/09/01/remove-debian-udev-persistent-net-rules/ here]. The solution is to remove the udev rules for persistent network interfaces, and disable the generator script for these rules.
On the root cloning node, do the following: #remove the file <code>/etc/udev/rules.d/70-persistent-net.rules</code> #add the following lines to the top of the file <code>/lib/udev/rules.d/75-persistent-net-generator.rules</code>: <source lang="text"># skip generation of persistent network interfaces ACTION=="*", GOTO="persistent_net_generator_end"</source> 3bb4950b8c0deb2a9ddfe6a70617f35cd1dc73dc 286 285 2010-04-24T16:43:16Z Rhiggins 4 /* udev and Network Interfaces */ wikitext text/x-wiki HCL Nodes will be installed from a clone of a root node, <code>hcl07</code>. The general installation of the root is documented here. There are a number of complications as a result of the cloning process. Solutions to these complications are also explained. =General Installation= Partition the filesystem with a 1 GB swap partition at the end of the disk, equal to the maximum memory installed on the cluster nodes. The root file system occupies the remainder of the disk, EXT4 format. Install the long list of packages. Configure the network interfaces as follows: <source lang="text"># This file describes the network interfaces available on your system # and how to activate them. For more information, see interfaces(5). # The loopback network interface auto lo eth1 eth0 iface lo inet loopback # The primary network interface allow-hotplug eth0 iface eth0 inet dhcp allow-hotplug eth1 iface eth1 inet dhcp </source> =Complications= ==Hostnames== Debian does not pull the hostname from the DHCP server. Without intervention, cloned nodes will keep the hostname stored on the image of the root node.
A bug report describing the setting of a hostname via DHCP can be found [https://bugs.launchpad.net/ubuntu/+source/dhcp3/+bug/90388 here]. The solution we will use is to add the file <code>/etc/dhcp3/dhclient-exit-hooks.d/hostname</code> with the contents: <source lang="bash"> if [[ -n $new_host_name ]]; then echo "$new_host_name" > /etc/hostname /bin/hostname $new_host_name fi </source> The effect of this is to set the hostname of the machine after an interface is configured using dhclient (DHCP Client). Note, the hostname of the machine will be set by the last interface that is configured via DHCP, which in the current configuration will be <code>eth0</code>. If an interface is reconfigured using dhclient, the hostname will be reset to the name belonging to that interface. Further, the current hostnames for the second interface on nodes <code>eth1</code> are '''invalid'''. They follow the format hcl??_eth1.ucd.ie, however, the '_' character is not permitted in hostnames and attempting to set such a hostname fails. ==udev and Network Interfaces== The udev system attempts to keep network interface names consistent regardless of changing hardware. This may be useful for laptops with wireless cards that are plugged in and out, but it causes problems when trying to install our root node image across all machines in the cluster. A description of the problem can be read [http://www.ducea.com/2008/09/01/remove-debian-udev-persistent-net-rules/ here]. The solution is to remove the udev rules for persistent network interfaces, and disable the generator script for these rules.
On the root cloning node, do the following: #remove the file <code>/etc/udev/rules.d/70-persistent-net.rules</code> #add the following lines to the top of the file <code>/lib/udev/rules.d/75-persistent-net-generator.rules</code>: <source lang="text"># skip generation of persistent network interfaces ACTION=="*", GOTO="persistent_net_generator_end"</source> f5757d6211e82feff4ff1575e3683527b40746e9 287 286 2010-04-24T18:58:38Z Rhiggins 4 /* General Installation */ wikitext text/x-wiki HCL Nodes will be installed from a clone of a root node, <code>hcl07</code>. The general installation of the root is documented here. There are a number of complications as a result of the cloning process. Solutions to these complications are also explained. =General Installation= Partition the filesystem with a 1 GB swap partition at the end of the disk, equal to the maximum memory installed on the cluster nodes. The root file system occupies the remainder of the disk, EXT4 format. Install the long list of packages. Configure the network interfaces as follows: <source lang="text"># This file describes the network interfaces available on your system # and how to activate them. For more information, see interfaces(5). # The loopback network interface auto lo eth1 eth0 iface lo inet loopback # The primary network interface allow-hotplug eth0 iface eth0 inet dhcp allow-hotplug eth1 iface eth1 inet dhcp </source> ==Ganglia== Install the packages gmetad, ganglia-monitor and ganglia-webfrontend. Configure the front end by appending the following to <code>/etc/apache2/apache2.conf</code>: Include /etc/ganglia-webfrontend/apache.conf Configure gmetad by adding the following line to <code>/etc/ganglia/gmetad.conf</code>: data_source "HCL Cluster" localhost Configure the ganglia monitor by editing <code>/etc/ganglia/gmond.conf</code> so that it contains: <source lang="text">cluster { name = "HCL Cluster" owner = "University College Dublin" latlong = "unspecified" url = "http://hcl.ucd.ie/" } </source> And ...
<source lang="text"> /* Feel free to specify as many udp_send_channels as you like. Gmond used to only support having a single channel */ udp_send_channel { mcast_join = 239.2.11.72 port = 8649 ttl = 1 } /* You can specify as many udp_recv_channels as you like as well. */ udp_recv_channel { mcast_join = 239.2.11.72 port = 8649 bind = 239.2.11.72 } </source> =Complications= ==Hostnames== Debian does not pull the hostname from the DHCP Server. Without intervention cloned nodes will keep the hostname stored on the image of the root node. A bug describing the setting of a hostname via DHCP is described [https://bugs.launchpad.net/ubuntu/+source/dhcp3/+bug/90388 here] The solution we will use is to add the file <code>/etc/dhcp3/dhclient-exit-hooks.d/hostname</code> with the contents: <source lang="bash"> if [[ -n $new_host_name ]]; then echo "$new_host_name" > /etc/hostname /bin/hostname $new_host_name fi </source> The effect of this is to set the hostname of the machine after an interface is configured using dhclient (DHCP Client). Note, the hostname of the machine will be set by the last interface that is configured via DHCP, in the current configuration that will be <code>eth0</code>. If an interface is reconfigured using dhclient, the hostname will be reset to the name belonging to that interface. Further, the current hostnames for the second interface on nodes <code>eth1</code> are '''invalid'''. They follow the format hcl??_eth1.ucd.ie, however the '_' character is not permitted in hostnames and attempting to set such a hostname fails. ==udev and Network Interfaces== The udev system attempts to keep network interface names consistent regardless of changing hardware. This may be useful for laptops with wirless cards that a plugged in and out, but it causes problems when trying to install our root node image across all machines in the cluster. A description of the problem can be read [http://www.ducea.com/2008/09/01/remove-debian-udev-persistent-net-rules/ here]. 
The solution is to remove the udev rules for persistent network interfaces, and disable the generator script for these rules. On the root cloning node do the following #remove the file <code>/etc/udev/rules.d/70-persistent-net.rules</code> #and to the top of the file: <code>/lib/udev/rules.d/75-persistent-net-generator.rules</code>, the following lines: <source lang="text"># skip generation of persistent network interfaces ACTION=="*", GOTO="persistent_net_generator_end"</source> c0b2589445f148ff5cd47314e28022faf5e1c30c 288 287 2010-04-24T19:00:48Z Rhiggins 4 /* Ganglia */ wikitext text/x-wiki HCL Nodes will be installed from a clone of a root node, <code>hcl07</code>. The general installation of the root is documented here. There are a number of complications as a result of the cloning process. Solutions to these complications are also explained. =General Installation= Partition filesystem with swap at the end of the disk, size 1GB, equal to maximum of the installed memory on cluster nodes. Root file system occupies the remainder of the disk, EXT4 format. Install long list of packages. Configure network interface as follows: <source lang="text"># This file describes the network interfaces available on your system # and how to activate them. For more information, see interfaces(5). # The loopback network interface auto lo eth1 eth0 iface lo inet loopback # The primary network interface allow-hotplug eth0 iface eth0 inet dhcp allow-hotplug eth1 iface eth1 inet dhcp </source> ==Ganglia== Install the packages gmetad ganglia-montior and ganglia-webfrontend. 
Configure the front end by appending to <code>/etc/apache2/apache2.conf</code>, the following: Include /etc/ganglia-webfrontend/apache.conf Configure gmetad by adding to the <code>/etc/ganglia/gmetad.conf</code>, the following line: data_source "HCL Cluster" localhost Configure ganglia monitor by editing <code>/etc/ganglia/gmond.conf</code> so that it contains: <source lang="text">cluster { name = "HCL Cluster" owner = "University College Dublin" latlong = "unspecified" url = "http://hcl.ucd.ie/" } </source> And ... <source lang="text"> /* Feel free to specify as many udp_send_channels as you like. Gmond used to only support having a single channel */ udp_send_channel { mcast_join = 239.2.11.72 port = 8649 ttl = 1 } /* You can specify as many udp_recv_channels as you like as well. */ udp_recv_channel { mcast_join = 239.2.11.72 port = 8649 bind = 239.2.11.72 } </source> After all packages are complete execute: <source lang="text"> service apache2 restart service gmetad restart service gmond restart </source> Pointing your browser to [http://192.168.21.254/ganglia/index.php] should display the monitoring page for HCL Cluster. <code>gmond</code> must also be installed and configured on the cluster nodes. =Complications= ==Hostnames== Debian does not pull the hostname from the DHCP Server. Without intervention cloned nodes will keep the hostname stored on the image of the root node. A bug describing the setting of a hostname via DHCP is described [https://bugs.launchpad.net/ubuntu/+source/dhcp3/+bug/90388 here] The solution we will use is to add the file <code>/etc/dhcp3/dhclient-exit-hooks.d/hostname</code> with the contents: <source lang="bash"> if [[ -n $new_host_name ]]; then echo "$new_host_name" > /etc/hostname /bin/hostname $new_host_name fi </source> The effect of this is to set the hostname of the machine after an interface is configured using dhclient (DHCP Client). 
Note, the hostname of the machine will be set by the last interface that is configured via DHCP, in the current configuration that will be <code>eth0</code>. If an interface is reconfigured using dhclient, the hostname will be reset to the name belonging to that interface. Further, the current hostnames for the second interface on nodes <code>eth1</code> are '''invalid'''. They follow the format hcl??_eth1.ucd.ie, however the '_' character is not permitted in hostnames and attempting to set such a hostname fails. ==udev and Network Interfaces== The udev system attempts to keep network interface names consistent regardless of changing hardware. This may be useful for laptops with wirless cards that a plugged in and out, but it causes problems when trying to install our root node image across all machines in the cluster. A description of the problem can be read [http://www.ducea.com/2008/09/01/remove-debian-udev-persistent-net-rules/ here]. The solution is to remove the udev rules for persistent network interfaces, and disable the generator script for these rules. On the root cloning node do the following #remove the file <code>/etc/udev/rules.d/70-persistent-net.rules</code> #and to the top of the file: <code>/lib/udev/rules.d/75-persistent-net-generator.rules</code>, the following lines: <source lang="text"># skip generation of persistent network interfaces ACTION=="*", GOTO="persistent_net_generator_end"</source> 562827371dc2f989e5b700b3ff60b081213ab22a 289 288 2010-04-24T19:01:26Z Rhiggins 4 /* Ganglia */ wikitext text/x-wiki HCL Nodes will be installed from a clone of a root node, <code>hcl07</code>. The general installation of the root is documented here. There are a number of complications as a result of the cloning process. Solutions to these complications are also explained. =General Installation= Partition filesystem with swap at the end of the disk, size 1GB, equal to maximum of the installed memory on cluster nodes. 
Root file system occupies the remainder of the disk, EXT4 format. Install long list of packages. Configure network interface as follows: <source lang="text"># This file describes the network interfaces available on your system # and how to activate them. For more information, see interfaces(5). # The loopback network interface auto lo eth1 eth0 iface lo inet loopback # The primary network interface allow-hotplug eth0 iface eth0 inet dhcp allow-hotplug eth1 iface eth1 inet dhcp </source> ==Ganglia== Install the packages gmetad ganglia-montior and ganglia-webfrontend. Configure the front end by appending to <code>/etc/apache2/apache2.conf</code>, the following: Include /etc/ganglia-webfrontend/apache.conf Configure gmetad by adding to the <code>/etc/ganglia/gmetad.conf</code>, the following line: data_source "HCL Cluster" localhost Configure ganglia monitor by editing <code>/etc/ganglia/gmond.conf</code> so that it contains: <source lang="text">cluster { name = "HCL Cluster" owner = "University College Dublin" latlong = "unspecified" url = "http://hcl.ucd.ie/" } </source> And ... <source lang="text"> /* Feel free to specify as many udp_send_channels as you like. Gmond used to only support having a single channel */ udp_send_channel { mcast_join = 239.2.11.72 port = 8649 ttl = 1 } /* You can specify as many udp_recv_channels as you like as well. */ udp_recv_channel { mcast_join = 239.2.11.72 port = 8649 bind = 239.2.11.72 } </source> After all packages are complete execute: <source lang="text"> service apache2 restart service gmetad restart service gmond restart </source> Pointing your browser to [http://192.168.21.254/ganglia/index.php here] should display the monitoring page for HCL Cluster. <code>gmond</code> must also be installed and configured on the cluster nodes. =Complications= ==Hostnames== Debian does not pull the hostname from the DHCP Server. Without intervention cloned nodes will keep the hostname stored on the image of the root node. 
A bug describing the setting of a hostname via DHCP is described [https://bugs.launchpad.net/ubuntu/+source/dhcp3/+bug/90388 here] The solution we will use is to add the file <code>/etc/dhcp3/dhclient-exit-hooks.d/hostname</code> with the contents: <source lang="bash"> if [[ -n $new_host_name ]]; then echo "$new_host_name" > /etc/hostname /bin/hostname $new_host_name fi </source> The effect of this is to set the hostname of the machine after an interface is configured using dhclient (DHCP Client). Note, the hostname of the machine will be set by the last interface that is configured via DHCP, in the current configuration that will be <code>eth0</code>. If an interface is reconfigured using dhclient, the hostname will be reset to the name belonging to that interface. Further, the current hostnames for the second interface on nodes <code>eth1</code> are '''invalid'''. They follow the format hcl??_eth1.ucd.ie, however the '_' character is not permitted in hostnames and attempting to set such a hostname fails. ==udev and Network Interfaces== The udev system attempts to keep network interface names consistent regardless of changing hardware. This may be useful for laptops with wirless cards that a plugged in and out, but it causes problems when trying to install our root node image across all machines in the cluster. A description of the problem can be read [http://www.ducea.com/2008/09/01/remove-debian-udev-persistent-net-rules/ here]. The solution is to remove the udev rules for persistent network interfaces, and disable the generator script for these rules. 
On the root cloning node do the following #remove the file <code>/etc/udev/rules.d/70-persistent-net.rules</code> #and to the top of the file: <code>/lib/udev/rules.d/75-persistent-net-generator.rules</code>, the following lines: <source lang="text"># skip generation of persistent network interfaces ACTION=="*", GOTO="persistent_net_generator_end"</source> 772cc8dd8a2cfef2a6848537cc0b97a26394f8e0 292 289 2010-04-24T20:14:24Z Rhiggins 4 /* Ganglia */ wikitext text/x-wiki HCL Nodes will be installed from a clone of a root node, <code>hcl07</code>. The general installation of the root is documented here. There are a number of complications as a result of the cloning process. Solutions to these complications are also explained. =General Installation= Partition filesystem with swap at the end of the disk, size 1GB, equal to maximum of the installed memory on cluster nodes. Root file system occupies the remainder of the disk, EXT4 format. Install long list of packages. Configure network interface as follows: <source lang="text"># This file describes the network interfaces available on your system # and how to activate them. For more information, see interfaces(5). # The loopback network interface auto lo eth1 eth0 iface lo inet loopback # The primary network interface allow-hotplug eth0 iface eth0 inet dhcp allow-hotplug eth1 iface eth1 inet dhcp </source> ==Ganglia== Install the ganglia-monitor package. Configure ganglia monitor by editing <code>/etc/ganglia/gmond.conf</code> so that it contains: <source lang="text">cluster { name = "HCL Cluster" owner = "University College Dublin" latlong = "unspecified" url = "http://hcl.ucd.ie/" } </source> And ... <source lang="text"> /* Feel free to specify as many udp_send_channels as you like. Gmond used to only support having a single channel */ udp_send_channel { mcast_join = 239.2.11.72 port = 8649 ttl = 1 } /* You can specify as many udp_recv_channels as you like as well. 
*/ udp_recv_channel { mcast_join = 239.2.11.72 port = 8649 bind = 239.2.11.72 } </source> After all packages are complete execute: <source lang="text"> service ganglia-monitor restart </source> =Complications= ==Hostnames== Debian does not pull the hostname from the DHCP Server. Without intervention cloned nodes will keep the hostname stored on the image of the root node. A bug describing the setting of a hostname via DHCP is described [https://bugs.launchpad.net/ubuntu/+source/dhcp3/+bug/90388 here] The solution we will use is to add the file <code>/etc/dhcp3/dhclient-exit-hooks.d/hostname</code> with the contents: <source lang="bash"> if [[ -n $new_host_name ]]; then echo "$new_host_name" > /etc/hostname /bin/hostname $new_host_name fi </source> The effect of this is to set the hostname of the machine after an interface is configured using dhclient (DHCP Client). Note, the hostname of the machine will be set by the last interface that is configured via DHCP, in the current configuration that will be <code>eth0</code>. If an interface is reconfigured using dhclient, the hostname will be reset to the name belonging to that interface. Further, the current hostnames for the second interface on nodes <code>eth1</code> are '''invalid'''. They follow the format hcl??_eth1.ucd.ie, however the '_' character is not permitted in hostnames and attempting to set such a hostname fails. ==udev and Network Interfaces== The udev system attempts to keep network interface names consistent regardless of changing hardware. This may be useful for laptops with wirless cards that a plugged in and out, but it causes problems when trying to install our root node image across all machines in the cluster. A description of the problem can be read [http://www.ducea.com/2008/09/01/remove-debian-udev-persistent-net-rules/ here]. The solution is to remove the udev rules for persistent network interfaces, and disable the generator script for these rules. 
On the root cloning node do the following #remove the file <code>/etc/udev/rules.d/70-persistent-net.rules</code> #and to the top of the file: <code>/lib/udev/rules.d/75-persistent-net-generator.rules</code>, the following lines: <source lang="text"># skip generation of persistent network interfaces ACTION=="*", GOTO="persistent_net_generator_end"</source> d6436a51f07242a31d82dc7fbfb035bd43aa3a57 293 292 2010-04-25T13:06:53Z Rhiggins 4 /* General Installation */ wikitext text/x-wiki HCL Nodes will be installed from a clone of a root node, <code>hcl07</code>. The general installation of the root is documented here. There are a number of complications as a result of the cloning process. Solutions to these complications are also explained. =General Installation= Partition filesystem with swap at the end of the disk, size 1GB, equal to maximum of the installed memory on cluster nodes. Root file system occupies the remainder of the disk, EXT4 format. Install long list of packages. Configure network interface as follows: <source lang="text"># This file describes the network interfaces available on your system # and how to activate them. For more information, see interfaces(5). # The loopback network interface auto lo eth1 eth0 iface lo inet loopback # The primary network interface allow-hotplug eth0 iface eth0 inet dhcp allow-hotplug eth1 iface eth1 inet dhcp </source> Change the hosts file so that it does not list the node's hostname <source lang="text> 127.0.0.1 localhost # The following lines are desirable for IPv6 capable hosts ::1 localhost ip6-localhost ip6-loopback fe00::0 ip6-localnet ff00::0 ip6-mcastprefix ff02::1 ip6-allnodes ff02::2 ip6-allrouters </source> ==Ganglia== Install the ganglia-monitor package. Configure ganglia monitor by editing <code>/etc/ganglia/gmond.conf</code> so that it contains: <source lang="text">cluster { name = "HCL Cluster" owner = "University College Dublin" latlong = "unspecified" url = "http://hcl.ucd.ie/" } </source> And ... 
<source lang="text"> /* Feel free to specify as many udp_send_channels as you like. Gmond used to only support having a single channel */ udp_send_channel { mcast_join = 239.2.11.72 port = 8649 ttl = 1 } /* You can specify as many udp_recv_channels as you like as well. */ udp_recv_channel { mcast_join = 239.2.11.72 port = 8649 bind = 239.2.11.72 } </source> After all packages are complete execute: <source lang="text"> service ganglia-monitor restart </source> =Complications= ==Hostnames== Debian does not pull the hostname from the DHCP Server. Without intervention cloned nodes will keep the hostname stored on the image of the root node. A bug describing the setting of a hostname via DHCP is described [https://bugs.launchpad.net/ubuntu/+source/dhcp3/+bug/90388 here] The solution we will use is to add the file <code>/etc/dhcp3/dhclient-exit-hooks.d/hostname</code> with the contents: <source lang="bash"> if [[ -n $new_host_name ]]; then echo "$new_host_name" > /etc/hostname /bin/hostname $new_host_name fi </source> The effect of this is to set the hostname of the machine after an interface is configured using dhclient (DHCP Client). Note, the hostname of the machine will be set by the last interface that is configured via DHCP, in the current configuration that will be <code>eth0</code>. If an interface is reconfigured using dhclient, the hostname will be reset to the name belonging to that interface. Further, the current hostnames for the second interface on nodes <code>eth1</code> are '''invalid'''. They follow the format hcl??_eth1.ucd.ie, however the '_' character is not permitted in hostnames and attempting to set such a hostname fails. ==udev and Network Interfaces== The udev system attempts to keep network interface names consistent regardless of changing hardware. This may be useful for laptops with wirless cards that a plugged in and out, but it causes problems when trying to install our root node image across all machines in the cluster. 
A description of the problem can be read [http://www.ducea.com/2008/09/01/remove-debian-udev-persistent-net-rules/ here]. The solution is to remove the udev rules for persistent network interfaces, and disable the generator script for these rules. On the root cloning node do the following #remove the file <code>/etc/udev/rules.d/70-persistent-net.rules</code> #and to the top of the file: <code>/lib/udev/rules.d/75-persistent-net-generator.rules</code>, the following lines: <source lang="text"># skip generation of persistent network interfaces ACTION=="*", GOTO="persistent_net_generator_end"</source> 9820e85104e9007bb80ca8a5473228fbdd21e6e7 294 293 2010-04-25T13:07:12Z Rhiggins 4 /* General Installation */ wikitext text/x-wiki HCL Nodes will be installed from a clone of a root node, <code>hcl07</code>. The general installation of the root is documented here. There are a number of complications as a result of the cloning process. Solutions to these complications are also explained. =General Installation= Partition filesystem with swap at the end of the disk, size 1GB, equal to maximum of the installed memory on cluster nodes. Root file system occupies the remainder of the disk, EXT4 format. Install long list of packages. Configure network interface as follows: <source lang="text"># This file describes the network interfaces available on your system # and how to activate them. For more information, see interfaces(5). # The loopback network interface auto lo eth1 eth0 iface lo inet loopback # The primary network interface allow-hotplug eth0 iface eth0 inet dhcp allow-hotplug eth1 iface eth1 inet dhcp </source> Change the hosts file so that it does not list the node's hostname <source lang="text"> 127.0.0.1 localhost # The following lines are desirable for IPv6 capable hosts ::1 localhost ip6-localhost ip6-loopback fe00::0 ip6-localnet ff00::0 ip6-mcastprefix ff02::1 ip6-allnodes ff02::2 ip6-allrouters </source> ==Ganglia== Install the ganglia-monitor package. 
Configure ganglia monitor by editing <code>/etc/ganglia/gmond.conf</code> so that it contains: <source lang="text">cluster { name = "HCL Cluster" owner = "University College Dublin" latlong = "unspecified" url = "http://hcl.ucd.ie/" } </source> And ... <source lang="text"> /* Feel free to specify as many udp_send_channels as you like. Gmond used to only support having a single channel */ udp_send_channel { mcast_join = 239.2.11.72 port = 8649 ttl = 1 } /* You can specify as many udp_recv_channels as you like as well. */ udp_recv_channel { mcast_join = 239.2.11.72 port = 8649 bind = 239.2.11.72 } </source> After all packages are complete execute: <source lang="text"> service ganglia-monitor restart </source> =Complications= ==Hostnames== Debian does not pull the hostname from the DHCP Server. Without intervention cloned nodes will keep the hostname stored on the image of the root node. A bug describing the setting of a hostname via DHCP is described [https://bugs.launchpad.net/ubuntu/+source/dhcp3/+bug/90388 here] The solution we will use is to add the file <code>/etc/dhcp3/dhclient-exit-hooks.d/hostname</code> with the contents: <source lang="bash"> if [[ -n $new_host_name ]]; then echo "$new_host_name" > /etc/hostname /bin/hostname $new_host_name fi </source> The effect of this is to set the hostname of the machine after an interface is configured using dhclient (DHCP Client). Note, the hostname of the machine will be set by the last interface that is configured via DHCP, in the current configuration that will be <code>eth0</code>. If an interface is reconfigured using dhclient, the hostname will be reset to the name belonging to that interface. Further, the current hostnames for the second interface on nodes <code>eth1</code> are '''invalid'''. They follow the format hcl??_eth1.ucd.ie, however the '_' character is not permitted in hostnames and attempting to set such a hostname fails. 
==udev and Network Interfaces== The udev system attempts to keep network interface names consistent regardless of changing hardware. This may be useful for laptops with wirless cards that a plugged in and out, but it causes problems when trying to install our root node image across all machines in the cluster. A description of the problem can be read [http://www.ducea.com/2008/09/01/remove-debian-udev-persistent-net-rules/ here]. The solution is to remove the udev rules for persistent network interfaces, and disable the generator script for these rules. On the root cloning node do the following #remove the file <code>/etc/udev/rules.d/70-persistent-net.rules</code> #and to the top of the file: <code>/lib/udev/rules.d/75-persistent-net-generator.rules</code>, the following lines: <source lang="text"># skip generation of persistent network interfaces ACTION=="*", GOTO="persistent_net_generator_end"</source> 122e5652d8fc142c48c48d7b7729b91f06969d37 297 294 2010-04-25T17:47:45Z Rhiggins 4 /* General Installation */ wikitext text/x-wiki HCL Nodes will be installed from a clone of a root node, <code>hcl07</code>. The general installation of the root is documented here. There are a number of complications as a result of the cloning process. Solutions to these complications are also explained. =General Installation= Partition filesystem with swap at the end of the disk, size 1GB, equal to maximum of the installed memory on cluster nodes. Root file system occupies the remainder of the disk, EXT4 format. Install long list of packages. Configure network interface as follows: <source lang="text"># This file describes the network interfaces available on your system # and how to activate them. For more information, see interfaces(5). 
# The loopback network interface auto lo eth1 eth0 iface lo inet loopback # The primary network interface allow-hotplug eth0 iface eth0 inet dhcp allow-hotplug eth1 iface eth1 inet dhcp </source> Change the hosts file so that it does not list the node's hostname <source lang="text"> 127.0.0.1 localhost # The following lines are desirable for IPv6 capable hosts ::1 localhost ip6-localhost ip6-loopback fe00::0 ip6-localnet ff00::0 ip6-mcastprefix ff02::1 ip6-allnodes ff02::2 ip6-allrouters </source> ==Ganglia== Install the ganglia-monitor package. Configure ganglia monitor by editing <code>/etc/ganglia/gmond.conf</code> so that it contains: <source lang="text">cluster { name = "HCL Cluster" owner = "University College Dublin" latlong = "unspecified" url = "http://hcl.ucd.ie/" } </source> And ... <source lang="text"> /* Feel free to specify as many udp_send_channels as you like. Gmond used to only support having a single channel */ udp_send_channel { mcast_join = 239.2.11.72 port = 8649 ttl = 1 } /* You can specify as many udp_recv_channels as you like as well. */ udp_recv_channel { mcast_join = 239.2.11.72 port = 8649 bind = 239.2.11.72 } </source> After all packages are complete execute: <source lang="text"> service ganglia-monitor restart </source> ==NIS Client== Install nis package. 
Set <code>/etc/defaultdomain</code> to contain <code>heterogeneous.ucd.ie</code> Make sure the NIS Server has an entry in <code>/etc/hosts</code> 192.168.21.254 heterogeneous.ucd.ie heterogeneous Make sure the file <code>/etc/nsswitch.conf</code> contains: passwd: compat group: compat shadow: compat Append to the file: <code>/etc/passwd</code> the line <code>+::::::</code> Append to the file: <code>/etc/group</code> the line <code>+:::</code> Start the nis service: service nis start Check that nis is operating correctly by running the following command: ypcat passwd ==NFS== Add the line 192.168.21.3:/home /home nfs soft,retrans=6 0 0 to <code>/etc/fstab</code> =Complications= ==Hostnames== Debian does not pull the hostname from the DHCP Server. Without intervention cloned nodes will keep the hostname stored on the image of the root node. A bug describing the setting of a hostname via DHCP is described [https://bugs.launchpad.net/ubuntu/+source/dhcp3/+bug/90388 here] The solution we will use is to add the file <code>/etc/dhcp3/dhclient-exit-hooks.d/hostname</code> with the contents: <source lang="bash"> if [[ -n $new_host_name ]]; then echo "$new_host_name" > /etc/hostname /bin/hostname $new_host_name fi </source> The effect of this is to set the hostname of the machine after an interface is configured using dhclient (DHCP Client). Note, the hostname of the machine will be set by the last interface that is configured via DHCP, in the current configuration that will be <code>eth0</code>. If an interface is reconfigured using dhclient, the hostname will be reset to the name belonging to that interface. Further, the current hostnames for the second interface on nodes <code>eth1</code> are '''invalid'''. They follow the format hcl??_eth1.ucd.ie, however the '_' character is not permitted in hostnames and attempting to set such a hostname fails. 
==udev and Network Interfaces== The udev system attempts to keep network interface names consistent regardless of changing hardware. This may be useful for laptops with wirless cards that a plugged in and out, but it causes problems when trying to install our root node image across all machines in the cluster. A description of the problem can be read [http://www.ducea.com/2008/09/01/remove-debian-udev-persistent-net-rules/ here]. The solution is to remove the udev rules for persistent network interfaces, and disable the generator script for these rules. On the root cloning node do the following #remove the file <code>/etc/udev/rules.d/70-persistent-net.rules</code> #and to the top of the file: <code>/lib/udev/rules.d/75-persistent-net-generator.rules</code>, the following lines: <source lang="text"># skip generation of persistent network interfaces ACTION=="*", GOTO="persistent_net_generator_end"</source> 59c505a98ba17ac7a92c88be94579d65656abcb5 298 297 2010-04-25T17:48:38Z Rhiggins 4 /* NIS Client */ wikitext text/x-wiki HCL Nodes will be installed from a clone of a root node, <code>hcl07</code>. The general installation of the root is documented here. There are a number of complications as a result of the cloning process. Solutions to these complications are also explained. =General Installation= Partition filesystem with swap at the end of the disk, size 1GB, equal to maximum of the installed memory on cluster nodes. Root file system occupies the remainder of the disk, EXT4 format. Install long list of packages. Configure network interface as follows: <source lang="text"># This file describes the network interfaces available on your system # and how to activate them. For more information, see interfaces(5). 
# The loopback network interface auto lo eth1 eth0 iface lo inet loopback # The primary network interface allow-hotplug eth0 iface eth0 inet dhcp allow-hotplug eth1 iface eth1 inet dhcp </source> Change the hosts file so that it does not list the node's hostname <source lang="text"> 127.0.0.1 localhost # The following lines are desirable for IPv6 capable hosts ::1 localhost ip6-localhost ip6-loopback fe00::0 ip6-localnet ff00::0 ip6-mcastprefix ff02::1 ip6-allnodes ff02::2 ip6-allrouters </source> ==Ganglia== Install the ganglia-monitor package. Configure ganglia monitor by editing <code>/etc/ganglia/gmond.conf</code> so that it contains: <source lang="text">cluster { name = "HCL Cluster" owner = "University College Dublin" latlong = "unspecified" url = "http://hcl.ucd.ie/" } </source> And ... <source lang="text"> /* Feel free to specify as many udp_send_channels as you like. Gmond used to only support having a single channel */ udp_send_channel { mcast_join = 239.2.11.72 port = 8649 ttl = 1 } /* You can specify as many udp_recv_channels as you like as well. */ udp_recv_channel { mcast_join = 239.2.11.72 port = 8649 bind = 239.2.11.72 } </source> After all packages are complete execute: <source lang="text"> service ganglia-monitor restart </source> ==NIS Client== Install nis package. 
Set <code>/etc/defaultdomain</code> to contain <code>heterogeneous.ucd.ie</code> Make sure the NIS Server has an entry in <code>/etc/hosts</code> 192.168.21.254 heterogeneous.ucd.ie heterogeneous Make sure the file <code>/etc/nsswitch.conf</code> contains: passwd: compat group: compat shadow: compat Append to the file: <code>/etc/passwd</code> the line <code>+::::::</code> Append to the file: <code>/etc/group</code> the line <code>+:::</code> Append to the file: <code>/etc/shadown</code> the line <code>+::::::::</code> Start the nis service: service nis start Check that nis is operating correctly by running the following command: ypcat passwd ==NFS== Add the line 192.168.21.3:/home /home nfs soft,retrans=6 0 0 to <code>/etc/fstab</code> =Complications= ==Hostnames== Debian does not pull the hostname from the DHCP Server. Without intervention cloned nodes will keep the hostname stored on the image of the root node. A bug describing the setting of a hostname via DHCP is described [https://bugs.launchpad.net/ubuntu/+source/dhcp3/+bug/90388 here] The solution we will use is to add the file <code>/etc/dhcp3/dhclient-exit-hooks.d/hostname</code> with the contents: <source lang="bash"> if [[ -n $new_host_name ]]; then echo "$new_host_name" > /etc/hostname /bin/hostname $new_host_name fi </source> The effect of this is to set the hostname of the machine after an interface is configured using dhclient (DHCP Client). Note, the hostname of the machine will be set by the last interface that is configured via DHCP, in the current configuration that will be <code>eth0</code>. If an interface is reconfigured using dhclient, the hostname will be reset to the name belonging to that interface. Further, the current hostnames for the second interface on nodes <code>eth1</code> are '''invalid'''. They follow the format hcl??_eth1.ucd.ie, however the '_' character is not permitted in hostnames and attempting to set such a hostname fails. 
HCL nodes will be installed from a clone of a root node, <code>hcl07</code>. The general installation of the root node is documented here. The cloning process introduces a number of complications; solutions to these are also explained.

=General Installation=

Partition the disk with a 1GB swap partition at the end, sized to match the maximum memory installed on any cluster node. The root file system, formatted as EXT4, occupies the remainder of the disk.

Install the required packages.

Configure the network interfaces in <code>/etc/network/interfaces</code> as follows:

<source lang="text">
# This file describes the network interfaces available on your system
# and how to activate them. For more information, see interfaces(5).
=HCL cluster/heterogeneous.ucd.ie install log=
The NIS Makefile will not pull userid and groupids that are lower than a certain value, we must set this to 500 in <code>/var/yp/Makefile</code> MINUID=500 MINGID=500 Start the <code>ypbind</code> and <code>yppasswd</code> services. Then initialise the NIS database: /usr/lib/yp/ypinit -m Accept defaults at prompts. Now start other NIS services service nis start ==Installing Ganglia Frontend== Install the packages gmetad and ganglia-webfrontend. Configure the front end by appending to <code>/etc/apache2/apache2.conf</code>, the following: Include /etc/ganglia-webfrontend/apache.conf Configure gmetad by adding to the <code>/etc/ganglia/gmetad.conf</code>, the following line: data_source "HCL Cluster" hcl07 After all packages are configured execute: <source lang="text"> service apache2 restart service gmetad restart </source> Pointing your browser to [http://192.168.21.254/ganglia/index.php here] should display the monitoring page for HCL Cluster. <code>gmond</code> must also be installed and configured on the cluster nodes. 59ced10d4703e4d2601a590dd14c8d042f9ede20 332 331 2010-04-27T11:53:35Z Rhiggins 4 /* Networking */ wikitext text/x-wiki * Basic installation of Debian Squeeze ==Networking== ===Interfaces=== * edit <code>/etc/networks/interfaces</code> Note that at some point eth1 should be configured by DHCP, it is on the UCD LAN and must be registered correctly (update MAC address with services). <code>eth0</code> is the internal network. <source lang="text"> # The loopback network interface auto lo eth0 eth1 iface lo inet loopback # The primary network interface allow-hotplug eth0 iface eth0 inet static address 192.168.21.254 netmask 255.255.255.0 gateway 192.168.21.1 iface eth1 inet static address 193.1.132.124 netmask 255.255.252.0 gateway 193.1.132.1 </source> ===DNS / BIND=== We will run our own DNS server for the cluster. 
First set resolv.conf: <source lang="text"> nameserver 192.168.21.254 nameserver 137.43.116.19 nameserver 137.43.116.17 nameserver 137.43.105.22 domain ucd.ie search ucd.ie </source> Now install bind9, edit <code>/etc/bind/named.conf.local</code> and set the domain zones for the cluster (forwards and reverse). <source lang="text"> // // Do any local configuration here // // Consider adding the 1918 zones here, if they are not used in your // organization //include "/etc/bind/zones.rfc1918"; include "/etc/bind/rndc.key"; controls { inet 127.0.0.1 allow { localhost; } keys { "rndc-key"; }; }; zone "heterogeneous.ucd.ie" { type master; file "db.heterogeneous.ucd.ie"; allow-update { key "rndc-key"; }; notify yes; }; zone "21.168.192.in-addr.arpa" { type master; file "db.192.168.21"; allow-update { key "rndc-key"; }; notify yes; }; <source> Now work on the zone files specified <code>db.heterogneneous.ucd.ie</code> and the reverse maps <code>db.192.168.21</code>: * Install non-free linux firmware for network interface (eth0). Edit <code>/etc/apt/sources.list</code> including the lines: <source lang="text">deb http://ftp.ie.debian.org/debian/ squeeze main contrib non-free deb-src http://ftp.ie.debian.org/debian/ squeeze main contrib non-free</source> * Install firmware-linux: <source lang="text">apt-get update && apt-get install firmware-linux</source> You probably need to reboot now. * Set up <code>iptables</code>. We want to implement NAT between the internal network (<code>eth0</code>) and external one (<code>eth1</code>). Add a script to <code>/etc/network/if-up.d</code> directory. All scripts in this directory will be executed after network interfaces are brought up, so this will persist: <source lang="bash"> #!/bin/sh PATH=/usr/sbin:/sbin:/bin:/usr/bin IF_INT=eth0 IF_EXT=eth1 # # delete all existing rules. 
# iptables -F iptables -t nat -F iptables -t mangle -F iptables -X # Always accept loopback traffic iptables -A INPUT -i lo -j ACCEPT # Allow established connections, and those not coming from the outside iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT iptables -A INPUT -m state --state NEW ! -i $IF_EXT -j ACCEPT iptables -A FORWARD -i $IF_EXT -o $IF_INT -m state --state ESTABLISHED,RELATED -j ACCEPT # Allow outgoing connections from the LAN side. iptables -A FORWARD -i $IF_INT -o $IF_EXT -j ACCEPT # Masquerade. iptables -t nat -A POSTROUTING -o $IF_EXT -j MASQUERADE # Don't forward from the outside to the inside. iptables -A FORWARD -i $IF_EXT -o $IF_EXT -j REJECT # Enable routing. echo 1 > /proc/sys/net/ipv4/ip_forward </source> ==Clonezilla== * follow the guide to installing Clonezilla [http://www.howtoforge.com/cloning-linux-systems-with-clonezilla-server-edition-clonezilla-se-p2 here]. Essentially: ** add repository key <source lang="text">wget -q http://drbl.sourceforge.net/GPG-KEY-DRBL -O- | apt-key add -</source> ** the line add to /etc/apt/sources.list: <source lang="text">deb http://drbl.sourceforge.net/drbl-core drbl stable</source> ** run: <source lang="text">apt-get update && apt-get install drbl && /opt/drbl/sbin/drbl4imp</source> ** accept default options to drbl4imp. * After Clonezilla has installed edit <code>/etc/dhcpd3/dhcpd.conf</code>, adding all entries for test nodes <code>hcl07</code> and <code>hcl03</code>. Also ensure these nodes have been removed from the inplace heterogeneous.ucd.ie server so that they are only served by one machine. <source lang="text"> default-lease-time 300; max-lease-time 300; option subnet-mask 255.255.255.0; option domain-name-servers 137.43.116.19,137.43.116.17,137.43.105.22; option domain-name "ucd.ie"; ddns-update-style none; # brett had ad-hoc ...? 
server-name drbl; filename = "pxelinux.0"; subnet 192.168.21.0 netmask 255.255.255.0 { option subnet-mask 255.255.255.0; option routers 192.168.21.1; next-server 192.168.21.254; pool { # allow members of "DRBL-Client"; range 192.168.21.200 192.168.21.212; } host hcl03 { option host-name "hcl03.ucd.ie"; hardware ethernet 00:14:22:0A:22:6C; fixed-address 192.168.21.5; } host hcl03_eth1 { option host-name "hcl03_eth1.ucd.ie"; hardware ethernet 00:14:22:0A:22:6D; fixed-address 192.168.21.105; } host hcl07 { option host-name "hcl07.ucd.ie"; hardware ethernet 00:14:22:0A:20:E2; fixed-address 192.168.21.9; } host hcl07_eth1 { option host-name "hcl07_eth1.ucd.ie"; hardware ethernet 00:14:22:0A:20:E3; fixed-address 192.168.21.109; } default-lease-time 21600; max-lease-time 43200; } </source> ==Install NIS== Copy users from <code>passwd</code>, <code>groups</code> and <code>shadow</code> from <code>/etc</code> on <code>hcl01</code>. Install nis. Edit <code>/etc/defaultdomain</code> so that it contains: heterogeneous.ucd.ie Edit <code>/etc/defaults/nis</code> so that it contains: # Are we a NIS server and if so what kind (values: false, slave, master) NISSERVER=master Edit <code>/etc/ypserv.securenets</code> so that is contains: # allow connects from local 255.0.0.0 127.0.0.0 # allow connections from heterogeneous subnet 255.255.255.0 192.168.21.0 The NIS host is also a client of itself, so do the client set up as follows: Edit <code>/etc/hosts</code> end ensure the NIS Master is listed 192.168.21.254 heterogeneous.ucd.ie heterogeneous Edit <code>/etc/yp.conf</code> and ensure that it contains: domain heterogeneous.ucd.ie server localhost Edit <code>/etc/passwd</code> adding a line to the end that reads: <code>+::::::</code>. Edit <code>/etc/group</code> with a line <code>+:::</code> at the line. 
The NIS Makefile will not pull userid and groupids that are lower than a certain value, we must set this to 500 in <code>/var/yp/Makefile</code> MINUID=500 MINGID=500 Start the <code>ypbind</code> and <code>yppasswd</code> services. Then initialise the NIS database: /usr/lib/yp/ypinit -m Accept defaults at prompts. Now start other NIS services service nis start ==Installing Ganglia Frontend== Install the packages gmetad and ganglia-webfrontend. Configure the front end by appending to <code>/etc/apache2/apache2.conf</code>, the following: Include /etc/ganglia-webfrontend/apache.conf Configure gmetad by adding to the <code>/etc/ganglia/gmetad.conf</code>, the following line: data_source "HCL Cluster" hcl07 After all packages are configured execute: <source lang="text"> service apache2 restart service gmetad restart </source> Pointing your browser to [http://192.168.21.254/ganglia/index.php here] should display the monitoring page for HCL Cluster. <code>gmond</code> must also be installed and configured on the cluster nodes. aa1ad8accf5c0268262ec80ff96c27f8c7a78b7b 333 332 2010-04-27T11:54:07Z Rhiggins 4 /* DNS / BIND */ wikitext text/x-wiki * Basic installation of Debian Squeeze ==Networking== ===Interfaces=== * edit <code>/etc/networks/interfaces</code> Note that at some point eth1 should be configured by DHCP, it is on the UCD LAN and must be registered correctly (update MAC address with services). <code>eth0</code> is the internal network. <source lang="text"> # The loopback network interface auto lo eth0 eth1 iface lo inet loopback # The primary network interface allow-hotplug eth0 iface eth0 inet static address 192.168.21.254 netmask 255.255.255.0 gateway 192.168.21.1 iface eth1 inet static address 193.1.132.124 netmask 255.255.252.0 gateway 193.1.132.1 </source> ===DNS / BIND=== We will run our own DNS server for the cluster. 
First set <code>/etc/resolv.conf</code>:
<source lang="text">
nameserver 127.0.0.1
nameserver 137.43.116.19
nameserver 137.43.116.17
nameserver 137.43.105.22
domain ucd.ie
search ucd.ie
</source>

Now install bind9, edit <code>/etc/bind/named.conf.local</code> and set the domain zones for the cluster (forward and reverse).
<source lang="text">
//
// Do any local configuration here
//
// Consider adding the 1918 zones here, if they are not used in your
// organization
//include "/etc/bind/zones.rfc1918";

include "/etc/bind/rndc.key";

controls {
    inet 127.0.0.1 allow { localhost; } keys { "rndc-key"; };
};

zone "heterogeneous.ucd.ie" {
    type master;
    file "db.heterogeneous.ucd.ie";
    allow-update { key "rndc-key"; };
    notify yes;
};

zone "21.168.192.in-addr.arpa" {
    type master;
    file "db.192.168.21";
    allow-update { key "rndc-key"; };
    notify yes;
};
</source>

Now work on the zone files specified, <code>db.heterogeneous.ucd.ie</code> and the reverse map <code>db.192.168.21</code>:

* Install non-free Linux firmware for the network interface (eth0). Edit <code>/etc/apt/sources.list</code> to include the lines:
<source lang="text">
deb http://ftp.ie.debian.org/debian/ squeeze main contrib non-free
deb-src http://ftp.ie.debian.org/debian/ squeeze main contrib non-free
</source>
* Install firmware-linux:
<source lang="text">apt-get update && apt-get install firmware-linux</source>
You probably need to reboot now.
* Set up <code>iptables</code>. We want to implement NAT between the internal network (<code>eth0</code>) and the external one (<code>eth1</code>). Add a script to the <code>/etc/network/if-up.d</code> directory. All scripts in this directory are executed after the network interfaces are brought up, so the rules will persist:
<source lang="bash">
#!/bin/sh
PATH=/usr/sbin:/sbin:/bin:/usr/bin

IF_INT=eth0
IF_EXT=eth1

#
# delete all existing rules.
#
iptables -F
iptables -t nat -F
iptables -t mangle -F
iptables -X

# Always accept loopback traffic
iptables -A INPUT -i lo -j ACCEPT

# Allow established connections, and those not coming from the outside
iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
iptables -A INPUT -m state --state NEW ! -i $IF_EXT -j ACCEPT
iptables -A FORWARD -i $IF_EXT -o $IF_INT -m state --state ESTABLISHED,RELATED -j ACCEPT

# Allow outgoing connections from the LAN side.
iptables -A FORWARD -i $IF_INT -o $IF_EXT -j ACCEPT

# Masquerade.
iptables -t nat -A POSTROUTING -o $IF_EXT -j MASQUERADE

# Don't forward from the outside to the inside.
iptables -A FORWARD -i $IF_EXT -o $IF_EXT -j REJECT

# Enable routing.
echo 1 > /proc/sys/net/ipv4/ip_forward
</source>

==Clonezilla==
* follow the guide to installing Clonezilla [http://www.howtoforge.com/cloning-linux-systems-with-clonezilla-server-edition-clonezilla-se-p2 here]. Essentially:
** add the repository key: <source lang="text">wget -q http://drbl.sourceforge.net/GPG-KEY-DRBL -O- | apt-key add -</source>
** add the line to <code>/etc/apt/sources.list</code>: <source lang="text">deb http://drbl.sourceforge.net/drbl-core drbl stable</source>
** run: <source lang="text">apt-get update && apt-get install drbl && /opt/drbl/sbin/drbl4imp</source>
** accept the default options to drbl4imp.
* After Clonezilla has installed, edit <code>/etc/dhcp3/dhcpd.conf</code>, adding entries for the test nodes <code>hcl07</code> and <code>hcl03</code>. Also ensure these nodes have been removed from the in-place heterogeneous.ucd.ie server, so that they are only served by one machine.
<source lang="text">
default-lease-time 300;
max-lease-time 300;
option subnet-mask 255.255.255.0;
option domain-name-servers 137.43.116.19,137.43.116.17,137.43.105.22;
option domain-name "ucd.ie";
ddns-update-style none;

# brett had ad-hoc ...?
server-name drbl;
filename = "pxelinux.0";

subnet 192.168.21.0 netmask 255.255.255.0 {
    option subnet-mask 255.255.255.0;
    option routers 192.168.21.1;
    next-server 192.168.21.254;
    pool {
        # allow members of "DRBL-Client";
        range 192.168.21.200 192.168.21.212;
    }
    host hcl03 {
        option host-name "hcl03.ucd.ie";
        hardware ethernet 00:14:22:0A:22:6C;
        fixed-address 192.168.21.5;
    }
    host hcl03_eth1 {
        option host-name "hcl03_eth1.ucd.ie";
        hardware ethernet 00:14:22:0A:22:6D;
        fixed-address 192.168.21.105;
    }
    host hcl07 {
        option host-name "hcl07.ucd.ie";
        hardware ethernet 00:14:22:0A:20:E2;
        fixed-address 192.168.21.9;
    }
    host hcl07_eth1 {
        option host-name "hcl07_eth1.ucd.ie";
        hardware ethernet 00:14:22:0A:20:E3;
        fixed-address 192.168.21.109;
    }
    default-lease-time 21600;
    max-lease-time 43200;
}
</source>

==Install NIS==
Copy the users from <code>passwd</code>, <code>group</code> and <code>shadow</code> in <code>/etc</code> on <code>hcl01</code>. Install nis.

Edit <code>/etc/defaultdomain</code> so that it contains:
 heterogeneous.ucd.ie
Edit <code>/etc/default/nis</code> so that it contains:
 # Are we a NIS server and if so what kind (values: false, slave, master)
 NISSERVER=master
Edit <code>/etc/ypserv.securenets</code> so that it contains:
 # allow connects from local
 255.0.0.0 127.0.0.0
 # allow connections from heterogeneous subnet
 255.255.255.0 192.168.21.0

The NIS host is also a client of itself, so do the client set-up as follows:

Edit <code>/etc/hosts</code> and ensure the NIS master is listed:
 192.168.21.254 heterogeneous.ucd.ie heterogeneous
Edit <code>/etc/yp.conf</code> and ensure that it contains:
 domain heterogeneous.ucd.ie server localhost
Edit <code>/etc/passwd</code>, adding a line to the end that reads <code>+::::::</code>. Edit <code>/etc/group</code>, adding a line <code>+:::</code> at the end.
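The two compat entries above can be appended from a shell. The helper below is a sketch (not from the original log); it adds an entry only if the exact line is not already present, so it can be re-run without duplicating lines:

```shell
# add_nis_compat FILE ENTRY: append ENTRY to FILE unless the exact
# line is already there (keeps repeated runs from duplicating it).
add_nis_compat() {
    grep -qx "$2" "$1" || printf '%s\n' "$2" >> "$1"
}

# On the NIS master this would be run as root:
#   add_nis_compat /etc/passwd '+::::::'
#   add_nis_compat /etc/group  '+:::'
```

The quoting matters: the compat entries consist only of <code>+</code> and colons, so they are passed as single-quoted literals.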
The NIS Makefile will not pull userid and groupids that are lower than a certain value, we must set this to 500 in <code>/var/yp/Makefile</code> MINUID=500 MINGID=500 Start the <code>ypbind</code> and <code>yppasswd</code> services. Then initialise the NIS database: /usr/lib/yp/ypinit -m Accept defaults at prompts. Now start other NIS services service nis start ==Installing Ganglia Frontend== Install the packages gmetad and ganglia-webfrontend. Configure the front end by appending to <code>/etc/apache2/apache2.conf</code>, the following: Include /etc/ganglia-webfrontend/apache.conf Configure gmetad by adding to the <code>/etc/ganglia/gmetad.conf</code>, the following line: data_source "HCL Cluster" hcl07 After all packages are configured execute: <source lang="text"> service apache2 restart service gmetad restart </source> Pointing your browser to [http://192.168.21.254/ganglia/index.php here] should display the monitoring page for HCL Cluster. <code>gmond</code> must also be installed and configured on the cluster nodes. 959b51c40f0657034d42a62b0639682ff98299f2 HCL cluster 0 5 301 282 2010-04-26T15:37:27Z Davepc 2 wikitext text/x-wiki http://hcl.ucd.ie/Hardware [[Cluster Specification]] == HCL Cluster 2.0 == In preperation for a fresh installation of operating systems on HCL Cluster the follow list of packages are requested. 
After upgrade is complete, this list will become a reference for installed software: Done * autoconf * automake * fftw2 * git * gfortran * gnuplot * libtool * netperf * octave3.2 * qhull * subversion * valgrind * gsl-dev * vim * python * mc * openmpi-bin * openmpi-dev * evince * libboost-graph-dev * libboost-serialization-dev * r-cran-strucchange * graphviz * doxygen [[new hcl node install & configuration log]] [[new heterogeneous.ucd.ie install log]] == Compilation on HCL == * Add to your environment <source lang="bash"> export ARCH=`uname -r` if [ `hostname` == 'hcl13.ucd.ie' ]; then export ARCH=`uname -r`smp fi </source> * On hcl09 and hcl10, create a directory $HOME/$ARCH. Actually, 2.4.27-2-386 and 2.6.11-1.1369_FC4smp directories will be created. * Configure all the software on hcl09 and hcl10, with <code>--prefix=$ARCH</code> == Installing Precompiled Software on HCL Fedora Core 4 nodes== If you need a piece of software that is not already installed you may not need to compile it from source. It is possible to install pre-compiled packages to folders in your home directory<ref>http://www.ajay.ws/2009/7/10/install-a-rpm-in-home-directory-as-non-root</ref>, and this is often easier than compiling from source. The cluster uses two different OSes at present, Fedora Core 4 and Debian. These both have different pre-compiled package formats, RPM for Fedora and .deb for Debian. Unfortunately relocating the default install directory of a .deb package is troublesome, so this article only pertains to Fedora. Fedora Core 4 is was EOL'd in 2006<ref>http://fedoraproject.org/wiki/LifeCycle/EOL</ref>and 4 years later there are few remaining resources supporting it. 
You can still find Fedora 4 RPMs at the following links (long unsearchable lists, give them time to load): http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/core/4/i386/os/Fedora/i386 http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/extras/4/i386 http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/core/updates/4/i386 What follows is an example of installing the 'gv' (ghostview) package into a folder defined by an environment variable: <code>$APPSDIR</code> (which has a similar to purpose to <code>$ARCH</code> in the previous section). #find an RPM file for your desired package<br/> #create rpm database directory in local tree (only do this once):<br/><code>mkdir -p $APPSDIR/var/lib/rpm</code> #initialise rpm database (do this only once also):<br/><code>rpm --initdb --root $APPSDIR</code> #check what files are installed by the RPM file of your package<br/><code>rpm -qlp gv-3.6.1-4.fc4.i386.rpm</code> #test for dependencies (install will not be successful and command will report an error accessing the root rpm database, ignore this):<br/><code>rpm -ivh gv-3.6.1-4.fc4.i386.rpm</code><br/>if a dependency is found, the RPM for it must also be found and appended to the <code>rpm -i</code> command (until no further dependencies are required). #do actual install to our local folder:<br/><code>rpm --root $APPSDIR --relocate /usr=$APPSDIR --nodeps -ivh gv-3.6.1-4.fc4.i386.rpm</code> 03311989befbda50e045bead622ab7300ee4c8f8 302 301 2010-04-26T15:37:44Z Davepc 2 wikitext text/x-wiki http://hcl.ucd.ie/Hardware * [[Cluster Specification]] == HCL Cluster 2.0 == In preperation for a fresh installation of operating systems on HCL Cluster the follow list of packages are requested. 
After upgrade is complete, this list will become a reference for installed software: Done * autoconf * automake * fftw2 * git * gfortran * gnuplot * libtool * netperf * octave3.2 * qhull * subversion * valgrind * gsl-dev * vim * python * mc * openmpi-bin * openmpi-dev * evince * libboost-graph-dev * libboost-serialization-dev * r-cran-strucchange * graphviz * doxygen [[new hcl node install & configuration log]] [[new heterogeneous.ucd.ie install log]] == Compilation on HCL == * Add to your environment <source lang="bash"> export ARCH=`uname -r` if [ `hostname` == 'hcl13.ucd.ie' ]; then export ARCH=`uname -r`smp fi </source> * On hcl09 and hcl10, create a directory $HOME/$ARCH. Actually, 2.4.27-2-386 and 2.6.11-1.1369_FC4smp directories will be created. * Configure all the software on hcl09 and hcl10, with <code>--prefix=$ARCH</code> == Installing Precompiled Software on HCL Fedora Core 4 nodes== If you need a piece of software that is not already installed you may not need to compile it from source. It is possible to install pre-compiled packages to folders in your home directory<ref>http://www.ajay.ws/2009/7/10/install-a-rpm-in-home-directory-as-non-root</ref>, and this is often easier than compiling from source. The cluster uses two different OSes at present, Fedora Core 4 and Debian. These both have different pre-compiled package formats, RPM for Fedora and .deb for Debian. Unfortunately relocating the default install directory of a .deb package is troublesome, so this article only pertains to Fedora. Fedora Core 4 is was EOL'd in 2006<ref>http://fedoraproject.org/wiki/LifeCycle/EOL</ref>and 4 years later there are few remaining resources supporting it. 
You can still find Fedora 4 RPMs at the following links (long unsearchable lists, give them time to load): http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/core/4/i386/os/Fedora/i386 http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/extras/4/i386 http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/core/updates/4/i386 What follows is an example of installing the 'gv' (ghostview) package into a folder defined by an environment variable: <code>$APPSDIR</code> (which has a similar to purpose to <code>$ARCH</code> in the previous section). #find an RPM file for your desired package<br/> #create rpm database directory in local tree (only do this once):<br/><code>mkdir -p $APPSDIR/var/lib/rpm</code> #initialise rpm database (do this only once also):<br/><code>rpm --initdb --root $APPSDIR</code> #check what files are installed by the RPM file of your package<br/><code>rpm -qlp gv-3.6.1-4.fc4.i386.rpm</code> #test for dependencies (install will not be successful and command will report an error accessing the root rpm database, ignore this):<br/><code>rpm -ivh gv-3.6.1-4.fc4.i386.rpm</code><br/>if a dependency is found, the RPM for it must also be found and appended to the <code>rpm -i</code> command (until no further dependencies are required). #do actual install to our local folder:<br/><code>rpm --root $APPSDIR --relocate /usr=$APPSDIR --nodeps -ivh gv-3.6.1-4.fc4.i386.rpm</code> 7142a4eb24a313e67a0e5ef838e929b665f8c098 319 302 2010-04-26T16:25:57Z Davepc 2 wikitext text/x-wiki http://hcl.ucd.ie/Hardware [[General Information]] [[Cluster Specification]] == HCL Cluster 2.0 == In preperation for a fresh installation of operating systems on HCL Cluster the follow list of packages are requested. 
After upgrade is complete, this list will become a reference for installed software: Done * autoconf * automake * fftw2 * git * gfortran * gnuplot * libtool * netperf * octave3.2 * qhull * subversion * valgrind * gsl-dev * vim * python * mc * openmpi-bin * openmpi-dev * evince * libboost-graph-dev * libboost-serialization-dev * r-cran-strucchange * graphviz * doxygen [[new hcl node install & configuration log]] [[new heterogeneous.ucd.ie install log]] == Compilation on HCL == * Add to your environment <source lang="bash"> export ARCH=`uname -r` if [ `hostname` == 'hcl13.ucd.ie' ]; then export ARCH=`uname -r`smp fi </source> * On hcl09 and hcl10, create a directory $HOME/$ARCH. Actually, 2.4.27-2-386 and 2.6.11-1.1369_FC4smp directories will be created. * Configure all the software on hcl09 and hcl10, with <code>--prefix=$ARCH</code> == Installing Precompiled Software on HCL Fedora Core 4 nodes== If you need a piece of software that is not already installed you may not need to compile it from source. It is possible to install pre-compiled packages to folders in your home directory<ref>http://www.ajay.ws/2009/7/10/install-a-rpm-in-home-directory-as-non-root</ref>, and this is often easier than compiling from source. The cluster uses two different OSes at present, Fedora Core 4 and Debian. These both have different pre-compiled package formats, RPM for Fedora and .deb for Debian. Unfortunately relocating the default install directory of a .deb package is troublesome, so this article only pertains to Fedora. Fedora Core 4 is was EOL'd in 2006<ref>http://fedoraproject.org/wiki/LifeCycle/EOL</ref>and 4 years later there are few remaining resources supporting it. 
You can still find Fedora 4 RPMs at the following links (long unsearchable lists, give them time to load): http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/core/4/i386/os/Fedora/i386 http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/extras/4/i386 http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/core/updates/4/i386 What follows is an example of installing the 'gv' (ghostview) package into a folder defined by an environment variable: <code>$APPSDIR</code> (which has a similar to purpose to <code>$ARCH</code> in the previous section). #find an RPM file for your desired package<br/> #create rpm database directory in local tree (only do this once):<br/><code>mkdir -p $APPSDIR/var/lib/rpm</code> #initialise rpm database (do this only once also):<br/><code>rpm --initdb --root $APPSDIR</code> #check what files are installed by the RPM file of your package<br/><code>rpm -qlp gv-3.6.1-4.fc4.i386.rpm</code> #test for dependencies (install will not be successful and command will report an error accessing the root rpm database, ignore this):<br/><code>rpm -ivh gv-3.6.1-4.fc4.i386.rpm</code><br/>if a dependency is found, the RPM for it must also be found and appended to the <code>rpm -i</code> command (until no further dependencies are required). #do actual install to our local folder:<br/><code>rpm --root $APPSDIR --relocate /usr=$APPSDIR --nodeps -ivh gv-3.6.1-4.fc4.i386.rpm</code> ee0d92f3aa8bf390cd214514af7ae5be1f4a6fd5 320 319 2010-04-26T16:35:32Z Davepc 2 wikitext text/x-wiki http://hcl.ucd.ie/Hardware ==General Information== [[Image:http://csserver.ucd.ie/~bbecker/hcl/network.jpg]] The hcl cluster is heterogeneous in computing hardware & network ability. Nodes are from Dell, IBM, and HP, with Celeron, Pentium 4, Xeon, and AMD processors ranging in speeds from 1.8 to 3.6Ghz. Accordingly architectures and parameters such as Front Side Bus, Cache, and Main Memory all vary. Operating System used is Debian “squeeze” with Linux kernel 2.6.32. 
The network hardware consists of two Cisco 24+4 port Gigabit switches. Each node has two Gigabit Ethernet ports: each eth0 is connected to the first switch and each eth1 to the second. The switches are also connected to each other. The bandwidth of each port can be configured to any value between 8 Kb/s and 1 Gb/s, which allows testing on a very large number of network topologies. As the bandwidth of the link connecting the two switches can also be configured, the cluster can act as two separate clusters connected by a single link. The diagram below shows a schematic of the cluster.

== Detailed Cluster Specification ==
A table of the hardware configuration is available here: [[Cluster Specification]]

== HCL Cluster 2.0 ==
In preparation for a fresh installation of operating systems on the HCL Cluster, the following list of packages has been requested. After the upgrade is complete, this list will serve as a reference for the installed software:

* autoconf
* automake
* fftw2
* git
* gfortran
* gnuplot
* libtool
* netperf
* octave3.2
* qhull
* subversion
* valgrind
* gsl-dev
* vim
* python
* mc
* openmpi-bin
* openmpi-dev
* evince
* libboost-graph-dev
* libboost-serialization-dev
* r-cran-strucchange
* graphviz
* doxygen

[[new hcl node install & configuration log]]

[[new heterogeneous.ucd.ie install log]]

== Compilation on HCL ==
* Add to your environment:
<source lang="bash">
export ARCH=`uname -r`
if [ "`hostname`" = 'hcl13.ucd.ie' ]; then
    export ARCH=`uname -r`smp
fi
</source>
* On hcl09 and hcl10, create a directory $HOME/$ARCH. In practice, the directories 2.4.27-2-386 and 2.6.11-1.1369_FC4smp will be created.
* Configure all the software on hcl09 and hcl10 with <code>--prefix=$ARCH</code>

== Installing Precompiled Software on HCL Fedora Core 4 nodes ==
If you need a piece of software that is not already installed, you may not need to compile it from source. It is possible to install pre-compiled packages into folders in your home directory<ref>http://www.ajay.ws/2009/7/10/install-a-rpm-in-home-directory-as-non-root</ref>, and this is often easier than compiling from source. The cluster currently uses two different operating systems, Fedora Core 4 and Debian, which use different pre-compiled package formats: RPM for Fedora and .deb for Debian. Unfortunately, relocating the default install directory of a .deb package is troublesome, so this article only pertains to Fedora. Fedora Core 4 was EOL'd in 2006<ref>http://fedoraproject.org/wiki/LifeCycle/EOL</ref> and 4 years later there are few remaining resources supporting it. You can still find Fedora 4 RPMs at the following links (long unsearchable lists; give them time to load):

http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/core/4/i386/os/Fedora/i386

http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/extras/4/i386

http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/core/updates/4/i386

What follows is an example of installing the 'gv' (ghostview) package into a folder defined by an environment variable, <code>$APPSDIR</code> (which has a similar purpose to <code>$ARCH</code> in the previous section).

#Find an RPM file for your desired package.
#Create the rpm database directory in the local tree (only do this once):<br/><code>mkdir -p $APPSDIR/var/lib/rpm</code>
#Initialise the rpm database (also only once):<br/><code>rpm --initdb --root $APPSDIR</code>
#Check what files are installed by the RPM file of your package:<br/><code>rpm -qlp gv-3.6.1-4.fc4.i386.rpm</code>
#Test for dependencies (the install will not succeed, and the command will report an error accessing the root rpm database; ignore this):<br/><code>rpm -ivh gv-3.6.1-4.fc4.i386.rpm</code><br/>If a dependency is found, the RPM for it must also be found and appended to the <code>rpm -i</code> command (until no further dependencies are required).
#Do the actual install into our local folder:<br/><code>rpm --root $APPSDIR --relocate /usr=$APPSDIR --nodeps -ivh gv-3.6.1-4.fc4.i386.rpm</code>
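The per-kernel <code>$ARCH</code> scheme from the ''Compilation on HCL'' section can be sketched end to end. This is a minimal sketch, not part of the cluster setup: the source directory and the final <code>echo</code> stand in for an actual <code>./configure && make && make install</code> run on a node.

```shell
# Sketch of a per-architecture build, assuming the $ARCH variable from
# the "Compilation on HCL" section above.
ARCH=`uname -r`
if [ "`hostname`" = 'hcl13.ucd.ie' ]; then
    ARCH=`uname -r`smp
fi
export ARCH

prefix="$HOME/$ARCH"
mkdir -p "$prefix"      # one install tree per kernel/architecture
# On a real node you would now run, from the package's source tree:
echo "./configure --prefix=$prefix && make && make install"
```

Because each node's binaries land under its own <code>$HOME/$ARCH</code>, the same shared home directory can hold builds for every kernel in the cluster side by side.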
After upgrade is complete, this list will become a reference for installed software: Done * autoconf * automake * fftw2 * git * gfortran * gnuplot * libtool * netperf * octave3.2 * qhull * subversion * valgrind * gsl-dev * vim * python * mc * openmpi-bin * openmpi-dev * evince * libboost-graph-dev * libboost-serialization-dev * r-cran-strucchange * graphviz * doxygen [[new hcl node install & configuration log]] [[new heterogeneous.ucd.ie install log]] == Compilation on HCL == * Add to your environment <source lang="bash"> export ARCH=`uname -r` if [ `hostname` == 'hcl13.ucd.ie' ]; then export ARCH=`uname -r`smp fi </source> * On hcl09 and hcl10, create a directory $HOME/$ARCH. Actually, 2.4.27-2-386 and 2.6.11-1.1369_FC4smp directories will be created. * Configure all the software on hcl09 and hcl10, with <code>--prefix=$ARCH</code> == Installing Precompiled Software on HCL Fedora Core 4 nodes== If you need a piece of software that is not already installed you may not need to compile it from source. It is possible to install pre-compiled packages to folders in your home directory<ref>http://www.ajay.ws/2009/7/10/install-a-rpm-in-home-directory-as-non-root</ref>, and this is often easier than compiling from source. The cluster uses two different OSes at present, Fedora Core 4 and Debian. These both have different pre-compiled package formats, RPM for Fedora and .deb for Debian. Unfortunately relocating the default install directory of a .deb package is troublesome, so this article only pertains to Fedora. Fedora Core 4 is was EOL'd in 2006<ref>http://fedoraproject.org/wiki/LifeCycle/EOL</ref>and 4 years later there are few remaining resources supporting it. 
You can still find Fedora 4 RPMs at the following links (long unsearchable lists, give them time to load): http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/core/4/i386/os/Fedora/i386 http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/extras/4/i386 http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/core/updates/4/i386 What follows is an example of installing the 'gv' (ghostview) package into a folder defined by an environment variable: <code>$APPSDIR</code> (which has a similar to purpose to <code>$ARCH</code> in the previous section). #find an RPM file for your desired package<br/> #create rpm database directory in local tree (only do this once):<br/><code>mkdir -p $APPSDIR/var/lib/rpm</code> #initialise rpm database (do this only once also):<br/><code>rpm --initdb --root $APPSDIR</code> #check what files are installed by the RPM file of your package<br/><code>rpm -qlp gv-3.6.1-4.fc4.i386.rpm</code> #test for dependencies (install will not be successful and command will report an error accessing the root rpm database, ignore this):<br/><code>rpm -ivh gv-3.6.1-4.fc4.i386.rpm</code><br/>if a dependency is found, the RPM for it must also be found and appended to the <code>rpm -i</code> command (until no further dependencies are required). #do actual install to our local folder:<br/><code>rpm --root $APPSDIR --relocate /usr=$APPSDIR --nodeps -ivh gv-3.6.1-4.fc4.i386.rpm</code> e3e1ce5116a3d2fbc9b4954be251f17bfedc6eed 322 321 2010-04-26T16:39:34Z Davepc 2 wikitext text/x-wiki http://hcl.ucd.ie/Hardware ==General Information== [[File:network.jpg|thumb|Layout of the Cluster]] The hcl cluster is heterogeneous in computing hardware & network ability. Nodes are from Dell, IBM, and HP, with Celeron, Pentium 4, Xeon, and AMD processors ranging in speeds from 1.8 to 3.6Ghz. Accordingly architectures and parameters such as Front Side Bus, Cache, and Main Memory all vary. Operating System used is Debian “squeeze” with Linux kernel 2.6.32. 
The network hardware consists of two Cisco 24+4 port Gigabit switches. Each node has two Gigabit ethernet ports - each eth0 is connected to the first switch, and each eth1 is connected to the second switch. The switches are also connected to each other. The bandwidth of each port can be configured to meet any value between 8Kb/s and 1Gb/s. This allows testing on a very large number of network topologies, As the bandwidth on the link connecting the two switches can also be configured, the cluster can actually act as two separate clusters connected via one link. The diagram below shows a schematic of the cluster. [[Cluster Specification]] == HCL Cluster 2.0 == In preperation for a fresh installation of operating systems on HCL Cluster the follow list of packages are requested. After upgrade is complete, this list will become a reference for installed software: Done * autoconf * automake * fftw2 * git * gfortran * gnuplot * libtool * netperf * octave3.2 * qhull * subversion * valgrind * gsl-dev * vim * python * mc * openmpi-bin * openmpi-dev * evince * libboost-graph-dev * libboost-serialization-dev * r-cran-strucchange * graphviz * doxygen [[new hcl node install & configuration log]] [[new heterogeneous.ucd.ie install log]] == Compilation on HCL == * Add to your environment <source lang="bash"> export ARCH=`uname -r` if [ `hostname` == 'hcl13.ucd.ie' ]; then export ARCH=`uname -r`smp fi </source> * On hcl09 and hcl10, create a directory $HOME/$ARCH. Actually, 2.4.27-2-386 and 2.6.11-1.1369_FC4smp directories will be created. * Configure all the software on hcl09 and hcl10, with <code>--prefix=$ARCH</code> == Installing Precompiled Software on HCL Fedora Core 4 nodes== If you need a piece of software that is not already installed you may not need to compile it from source. 
It is possible to install pre-compiled packages to folders in your home directory<ref>http://www.ajay.ws/2009/7/10/install-a-rpm-in-home-directory-as-non-root</ref>, and this is often easier than compiling from source. The cluster uses two different OSes at present, Fedora Core 4 and Debian. These both have different pre-compiled package formats, RPM for Fedora and .deb for Debian. Unfortunately relocating the default install directory of a .deb package is troublesome, so this article only pertains to Fedora. Fedora Core 4 is was EOL'd in 2006<ref>http://fedoraproject.org/wiki/LifeCycle/EOL</ref>and 4 years later there are few remaining resources supporting it. You can still find Fedora 4 RPMs at the following links (long unsearchable lists, give them time to load): http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/core/4/i386/os/Fedora/i386 http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/extras/4/i386 http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/core/updates/4/i386 What follows is an example of installing the 'gv' (ghostview) package into a folder defined by an environment variable: <code>$APPSDIR</code> (which has a similar to purpose to <code>$ARCH</code> in the previous section). #find an RPM file for your desired package<br/> #create rpm database directory in local tree (only do this once):<br/><code>mkdir -p $APPSDIR/var/lib/rpm</code> #initialise rpm database (do this only once also):<br/><code>rpm --initdb --root $APPSDIR</code> #check what files are installed by the RPM file of your package<br/><code>rpm -qlp gv-3.6.1-4.fc4.i386.rpm</code> #test for dependencies (install will not be successful and command will report an error accessing the root rpm database, ignore this):<br/><code>rpm -ivh gv-3.6.1-4.fc4.i386.rpm</code><br/>if a dependency is found, the RPM for it must also be found and appended to the <code>rpm -i</code> command (until no further dependencies are required). 
#do actual install to our local folder:<br/><code>rpm --root $APPSDIR --relocate /usr=$APPSDIR --nodeps -ivh gv-3.6.1-4.fc4.i386.rpm</code> [[Image:Example.jpg]] 134f2e0ce35a2194d9416b28cbf1036b828664cf 323 322 2010-04-26T16:42:59Z Davepc 2 wikitext text/x-wiki http://hcl.ucd.ie/Hardware ==General Information== [[Image:network.jpg|right|Layout of the Cluster]] [[File:network.jpg|thumb|Layout of the Cluster]] The hcl cluster is heterogeneous in computing hardware & network ability. Nodes are from Dell, IBM, and HP, with Celeron, Pentium 4, Xeon, and AMD processors ranging in speeds from 1.8 to 3.6Ghz. Accordingly architectures and parameters such as Front Side Bus, Cache, and Main Memory all vary. Operating System used is Debian “squeeze” with Linux kernel 2.6.32. The network hardware consists of two Cisco 24+4 port Gigabit switches. Each node has two Gigabit ethernet ports - each eth0 is connected to the first switch, and each eth1 is connected to the second switch. The switches are also connected to each other. The bandwidth of each port can be configured to meet any value between 8Kb/s and 1Gb/s. This allows testing on a very large number of network topologies, As the bandwidth on the link connecting the two switches can also be configured, the cluster can actually act as two separate clusters connected via one link. The diagram below shows a schematic of the cluster. [[Cluster Specification]] == HCL Cluster 2.0 == In preperation for a fresh installation of operating systems on HCL Cluster the follow list of packages are requested. 
After upgrade is complete, this list will become a reference for installed software: Done * autoconf * automake * fftw2 * git * gfortran * gnuplot * libtool * netperf * octave3.2 * qhull * subversion * valgrind * gsl-dev * vim * python * mc * openmpi-bin * openmpi-dev * evince * libboost-graph-dev * libboost-serialization-dev * r-cran-strucchange * graphviz * doxygen [[new hcl node install & configuration log]] [[new heterogeneous.ucd.ie install log]] == Compilation on HCL == * Add to your environment <source lang="bash"> export ARCH=`uname -r` if [ `hostname` == 'hcl13.ucd.ie' ]; then export ARCH=`uname -r`smp fi </source> * On hcl09 and hcl10, create a directory $HOME/$ARCH. Actually, 2.4.27-2-386 and 2.6.11-1.1369_FC4smp directories will be created. * Configure all the software on hcl09 and hcl10, with <code>--prefix=$ARCH</code> == Installing Precompiled Software on HCL Fedora Core 4 nodes== If you need a piece of software that is not already installed you may not need to compile it from source. It is possible to install pre-compiled packages to folders in your home directory<ref>http://www.ajay.ws/2009/7/10/install-a-rpm-in-home-directory-as-non-root</ref>, and this is often easier than compiling from source. The cluster uses two different OSes at present, Fedora Core 4 and Debian. These both have different pre-compiled package formats, RPM for Fedora and .deb for Debian. Unfortunately relocating the default install directory of a .deb package is troublesome, so this article only pertains to Fedora. Fedora Core 4 is was EOL'd in 2006<ref>http://fedoraproject.org/wiki/LifeCycle/EOL</ref>and 4 years later there are few remaining resources supporting it. 
You can still find Fedora 4 RPMs at the following links (long unsearchable lists, give them time to load): http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/core/4/i386/os/Fedora/i386 http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/extras/4/i386 http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/core/updates/4/i386 What follows is an example of installing the 'gv' (ghostview) package into a folder defined by an environment variable: <code>$APPSDIR</code> (which has a similar to purpose to <code>$ARCH</code> in the previous section). #find an RPM file for your desired package<br/> #create rpm database directory in local tree (only do this once):<br/><code>mkdir -p $APPSDIR/var/lib/rpm</code> #initialise rpm database (do this only once also):<br/><code>rpm --initdb --root $APPSDIR</code> #check what files are installed by the RPM file of your package<br/><code>rpm -qlp gv-3.6.1-4.fc4.i386.rpm</code> #test for dependencies (install will not be successful and command will report an error accessing the root rpm database, ignore this):<br/><code>rpm -ivh gv-3.6.1-4.fc4.i386.rpm</code><br/>if a dependency is found, the RPM for it must also be found and appended to the <code>rpm -i</code> command (until no further dependencies are required). #do actual install to our local folder:<br/><code>rpm --root $APPSDIR --relocate /usr=$APPSDIR --nodeps -ivh gv-3.6.1-4.fc4.i386.rpm</code> [[Image:Example.jpg]] a126c57e01bc81b4dad851fbb106834a5ab5b2b9 324 323 2010-04-26T16:46:27Z Davepc 2 /* General Information */ wikitext text/x-wiki http://hcl.ucd.ie/Hardware ==General Information== <span class="plainlinks">[http://linktopage http://csserver.ucd.ie/~bbecker/hcl/network.jpg]</span> [[Image:network.jpg|right|Layout of the Cluster]] [[File:network.jpg|thumb|Layout of the Cluster]] The hcl cluster is heterogeneous in computing hardware & network ability. Nodes are from Dell, IBM, and HP, with Celeron, Pentium 4, Xeon, and AMD processors ranging in speeds from 1.8 to 3.6Ghz. 
Accordingly architectures and parameters such as Front Side Bus, Cache, and Main Memory all vary. Operating System used is Debian “squeeze” with Linux kernel 2.6.32. The network hardware consists of two Cisco 24+4 port Gigabit switches. Each node has two Gigabit ethernet ports - each eth0 is connected to the first switch, and each eth1 is connected to the second switch. The switches are also connected to each other. The bandwidth of each port can be configured to meet any value between 8Kb/s and 1Gb/s. This allows testing on a very large number of network topologies, As the bandwidth on the link connecting the two switches can also be configured, the cluster can actually act as two separate clusters connected via one link. The diagram below shows a schematic of the cluster. [[Cluster Specification]] == HCL Cluster 2.0 == In preperation for a fresh installation of operating systems on HCL Cluster the follow list of packages are requested. After upgrade is complete, this list will become a reference for installed software: Done * autoconf * automake * fftw2 * git * gfortran * gnuplot * libtool * netperf * octave3.2 * qhull * subversion * valgrind * gsl-dev * vim * python * mc * openmpi-bin * openmpi-dev * evince * libboost-graph-dev * libboost-serialization-dev * r-cran-strucchange * graphviz * doxygen [[new hcl node install & configuration log]] [[new heterogeneous.ucd.ie install log]] == Compilation on HCL == * Add to your environment <source lang="bash"> export ARCH=`uname -r` if [ `hostname` == 'hcl13.ucd.ie' ]; then export ARCH=`uname -r`smp fi </source> * On hcl09 and hcl10, create a directory $HOME/$ARCH. Actually, 2.4.27-2-386 and 2.6.11-1.1369_FC4smp directories will be created. * Configure all the software on hcl09 and hcl10, with <code>--prefix=$ARCH</code> == Installing Precompiled Software on HCL Fedora Core 4 nodes== If you need a piece of software that is not already installed you may not need to compile it from source. 
It is possible to install pre-compiled packages to folders in your home directory<ref>http://www.ajay.ws/2009/7/10/install-a-rpm-in-home-directory-as-non-root</ref>, and this is often easier than compiling from source. The cluster uses two different OSes at present, Fedora Core 4 and Debian. These both have different pre-compiled package formats, RPM for Fedora and .deb for Debian. Unfortunately relocating the default install directory of a .deb package is troublesome, so this article only pertains to Fedora. Fedora Core 4 is was EOL'd in 2006<ref>http://fedoraproject.org/wiki/LifeCycle/EOL</ref>and 4 years later there are few remaining resources supporting it. You can still find Fedora 4 RPMs at the following links (long unsearchable lists, give them time to load): http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/core/4/i386/os/Fedora/i386 http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/extras/4/i386 http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/core/updates/4/i386 What follows is an example of installing the 'gv' (ghostview) package into a folder defined by an environment variable: <code>$APPSDIR</code> (which has a similar to purpose to <code>$ARCH</code> in the previous section). #find an RPM file for your desired package<br/> #create rpm database directory in local tree (only do this once):<br/><code>mkdir -p $APPSDIR/var/lib/rpm</code> #initialise rpm database (do this only once also):<br/><code>rpm --initdb --root $APPSDIR</code> #check what files are installed by the RPM file of your package<br/><code>rpm -qlp gv-3.6.1-4.fc4.i386.rpm</code> #test for dependencies (install will not be successful and command will report an error accessing the root rpm database, ignore this):<br/><code>rpm -ivh gv-3.6.1-4.fc4.i386.rpm</code><br/>if a dependency is found, the RPM for it must also be found and appended to the <code>rpm -i</code> command (until no further dependencies are required). 
#do actual install to our local folder:<br/><code>rpm --root $APPSDIR --relocate /usr=$APPSDIR --nodeps -ivh gv-3.6.1-4.fc4.i386.rpm</code> [[Image:Example.jpg]] 91754f5140556311eeeddb50f42d5bdd6e8923c4 325 324 2010-04-26T16:46:43Z Davepc 2 /* General Information */ wikitext text/x-wiki http://hcl.ucd.ie/Hardware ==General Information== [[Image:network.jpg|right|Layout of the Cluster]] [[File:network.jpg|thumb|Layout of the Cluster]] The hcl cluster is heterogeneous in computing hardware & network ability. Nodes are from Dell, IBM, and HP, with Celeron, Pentium 4, Xeon, and AMD processors ranging in speeds from 1.8 to 3.6Ghz. Accordingly architectures and parameters such as Front Side Bus, Cache, and Main Memory all vary. Operating System used is Debian “squeeze” with Linux kernel 2.6.32. The network hardware consists of two Cisco 24+4 port Gigabit switches. Each node has two Gigabit ethernet ports - each eth0 is connected to the first switch, and each eth1 is connected to the second switch. The switches are also connected to each other. The bandwidth of each port can be configured to meet any value between 8Kb/s and 1Gb/s. This allows testing on a very large number of network topologies, As the bandwidth on the link connecting the two switches can also be configured, the cluster can actually act as two separate clusters connected via one link. The diagram below shows a schematic of the cluster. [[Cluster Specification]] == HCL Cluster 2.0 == In preperation for a fresh installation of operating systems on HCL Cluster the follow list of packages are requested. 
After upgrade is complete, this list will become a reference for installed software: Done * autoconf * automake * fftw2 * git * gfortran * gnuplot * libtool * netperf * octave3.2 * qhull * subversion * valgrind * gsl-dev * vim * python * mc * openmpi-bin * openmpi-dev * evince * libboost-graph-dev * libboost-serialization-dev * r-cran-strucchange * graphviz * doxygen [[new hcl node install & configuration log]] [[new heterogeneous.ucd.ie install log]] == Compilation on HCL == * Add to your environment <source lang="bash"> export ARCH=`uname -r` if [ `hostname` == 'hcl13.ucd.ie' ]; then export ARCH=`uname -r`smp fi </source> * On hcl09 and hcl10, create a directory $HOME/$ARCH. Actually, 2.4.27-2-386 and 2.6.11-1.1369_FC4smp directories will be created. * Configure all the software on hcl09 and hcl10, with <code>--prefix=$ARCH</code> == Installing Precompiled Software on HCL Fedora Core 4 nodes== If you need a piece of software that is not already installed you may not need to compile it from source. It is possible to install pre-compiled packages to folders in your home directory<ref>http://www.ajay.ws/2009/7/10/install-a-rpm-in-home-directory-as-non-root</ref>, and this is often easier than compiling from source. The cluster uses two different OSes at present, Fedora Core 4 and Debian. These both have different pre-compiled package formats, RPM for Fedora and .deb for Debian. Unfortunately relocating the default install directory of a .deb package is troublesome, so this article only pertains to Fedora. Fedora Core 4 is was EOL'd in 2006<ref>http://fedoraproject.org/wiki/LifeCycle/EOL</ref>and 4 years later there are few remaining resources supporting it. 
You can still find Fedora 4 RPMs at the following links (long unsearchable lists, give them time to load): http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/core/4/i386/os/Fedora/i386 http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/extras/4/i386 http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/core/updates/4/i386 What follows is an example of installing the 'gv' (ghostview) package into a folder defined by an environment variable: <code>$APPSDIR</code> (which has a similar to purpose to <code>$ARCH</code> in the previous section). #find an RPM file for your desired package<br/> #create rpm database directory in local tree (only do this once):<br/><code>mkdir -p $APPSDIR/var/lib/rpm</code> #initialise rpm database (do this only once also):<br/><code>rpm --initdb --root $APPSDIR</code> #check what files are installed by the RPM file of your package<br/><code>rpm -qlp gv-3.6.1-4.fc4.i386.rpm</code> #test for dependencies (install will not be successful and command will report an error accessing the root rpm database, ignore this):<br/><code>rpm -ivh gv-3.6.1-4.fc4.i386.rpm</code><br/>if a dependency is found, the RPM for it must also be found and appended to the <code>rpm -i</code> command (until no further dependencies are required). #do actual install to our local folder:<br/><code>rpm --root $APPSDIR --relocate /usr=$APPSDIR --nodeps -ivh gv-3.6.1-4.fc4.i386.rpm</code> [[Image:Example.jpg]] a126c57e01bc81b4dad851fbb106834a5ab5b2b9 326 325 2010-04-26T16:46:57Z Davepc 2 /* General Information */ wikitext text/x-wiki http://hcl.ucd.ie/Hardware ==General Information== [[Image:network.jpg|right|Layout of the Cluster]] The hcl cluster is heterogeneous in computing hardware & network ability. Nodes are from Dell, IBM, and HP, with Celeron, Pentium 4, Xeon, and AMD processors ranging in speeds from 1.8 to 3.6Ghz. Accordingly architectures and parameters such as Front Side Bus, Cache, and Main Memory all vary. 
Operating System used is Debian “squeeze” with Linux kernel 2.6.32. The network hardware consists of two Cisco 24+4 port Gigabit switches. Each node has two Gigabit ethernet ports - each eth0 is connected to the first switch, and each eth1 is connected to the second switch. The switches are also connected to each other. The bandwidth of each port can be configured to meet any value between 8Kb/s and 1Gb/s. This allows testing on a very large number of network topologies, As the bandwidth on the link connecting the two switches can also be configured, the cluster can actually act as two separate clusters connected via one link. The diagram below shows a schematic of the cluster. [[Cluster Specification]] == HCL Cluster 2.0 == In preperation for a fresh installation of operating systems on HCL Cluster the follow list of packages are requested. After upgrade is complete, this list will become a reference for installed software: Done * autoconf * automake * fftw2 * git * gfortran * gnuplot * libtool * netperf * octave3.2 * qhull * subversion * valgrind * gsl-dev * vim * python * mc * openmpi-bin * openmpi-dev * evince * libboost-graph-dev * libboost-serialization-dev * r-cran-strucchange * graphviz * doxygen [[new hcl node install & configuration log]] [[new heterogeneous.ucd.ie install log]] == Compilation on HCL == * Add to your environment <source lang="bash"> export ARCH=`uname -r` if [ `hostname` == 'hcl13.ucd.ie' ]; then export ARCH=`uname -r`smp fi </source> * On hcl09 and hcl10, create a directory $HOME/$ARCH. Actually, 2.4.27-2-386 and 2.6.11-1.1369_FC4smp directories will be created. * Configure all the software on hcl09 and hcl10, with <code>--prefix=$ARCH</code> == Installing Precompiled Software on HCL Fedora Core 4 nodes== If you need a piece of software that is not already installed you may not need to compile it from source. 
It is possible to install pre-compiled packages to folders in your home directory<ref>http://www.ajay.ws/2009/7/10/install-a-rpm-in-home-directory-as-non-root</ref>, and this is often easier than compiling from source. The cluster uses two different OSes at present, Fedora Core 4 and Debian. These both have different pre-compiled package formats, RPM for Fedora and .deb for Debian. Unfortunately relocating the default install directory of a .deb package is troublesome, so this article only pertains to Fedora. Fedora Core 4 is was EOL'd in 2006<ref>http://fedoraproject.org/wiki/LifeCycle/EOL</ref>and 4 years later there are few remaining resources supporting it. You can still find Fedora 4 RPMs at the following links (long unsearchable lists, give them time to load): http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/core/4/i386/os/Fedora/i386 http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/extras/4/i386 http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/core/updates/4/i386 What follows is an example of installing the 'gv' (ghostview) package into a folder defined by an environment variable: <code>$APPSDIR</code> (which has a similar to purpose to <code>$ARCH</code> in the previous section). #find an RPM file for your desired package<br/> #create rpm database directory in local tree (only do this once):<br/><code>mkdir -p $APPSDIR/var/lib/rpm</code> #initialise rpm database (do this only once also):<br/><code>rpm --initdb --root $APPSDIR</code> #check what files are installed by the RPM file of your package<br/><code>rpm -qlp gv-3.6.1-4.fc4.i386.rpm</code> #test for dependencies (install will not be successful and command will report an error accessing the root rpm database, ignore this):<br/><code>rpm -ivh gv-3.6.1-4.fc4.i386.rpm</code><br/>if a dependency is found, the RPM for it must also be found and appended to the <code>rpm -i</code> command (until no further dependencies are required). 
#do actual install to our local folder:<br/><code>rpm --root $APPSDIR --relocate /usr=$APPSDIR --nodeps -ivh gv-3.6.1-4.fc4.i386.rpm</code> [[Image:Example.jpg]] d33003b184d2da5be048b6a7bfd014a6cebeef18 327 326 2010-04-26T16:48:41Z Davepc 2 /* HCL Cluster 2.0 */ wikitext text/x-wiki http://hcl.ucd.ie/Hardware ==General Information== [[Image:network.jpg|right|Layout of the Cluster]] The hcl cluster is heterogeneous in computing hardware & network ability. Nodes are from Dell, IBM, and HP, with Celeron, Pentium 4, Xeon, and AMD processors ranging in speeds from 1.8 to 3.6Ghz. Accordingly architectures and parameters such as Front Side Bus, Cache, and Main Memory all vary. Operating System used is Debian “squeeze” with Linux kernel 2.6.32. The network hardware consists of two Cisco 24+4 port Gigabit switches. Each node has two Gigabit ethernet ports - each eth0 is connected to the first switch, and each eth1 is connected to the second switch. The switches are also connected to each other. The bandwidth of each port can be configured to meet any value between 8Kb/s and 1Gb/s. This allows testing on a very large number of network topologies, As the bandwidth on the link connecting the two switches can also be configured, the cluster can actually act as two separate clusters connected via one link. The diagram below shows a schematic of the cluster. 
[[Cluster Specification]] == Software packages available on HCL Cluster 2.0 == Wit a fresh installation of operating systems on HCL Cluster the follow list of packages are avalible: * autoconf * automake * fftw2 * git * gfortran * gnuplot * libtool * netperf * octave3.2 * qhull * subversion * valgrind * gsl-dev * vim * python * mc * openmpi-bin * openmpi-dev * evince * libboost-graph-dev * libboost-serialization-dev * r-cran-strucchange * graphviz * doxygen [[new hcl node install & configuration log]] [[new heterogeneous.ucd.ie install log]] == Compilation on HCL == * Add to your environment <source lang="bash"> export ARCH=`uname -r` if [ `hostname` == 'hcl13.ucd.ie' ]; then export ARCH=`uname -r`smp fi </source> * On hcl09 and hcl10, create a directory $HOME/$ARCH. Actually, 2.4.27-2-386 and 2.6.11-1.1369_FC4smp directories will be created. * Configure all the software on hcl09 and hcl10, with <code>--prefix=$ARCH</code> == Installing Precompiled Software on HCL Fedora Core 4 nodes== If you need a piece of software that is not already installed you may not need to compile it from source. It is possible to install pre-compiled packages to folders in your home directory<ref>http://www.ajay.ws/2009/7/10/install-a-rpm-in-home-directory-as-non-root</ref>, and this is often easier than compiling from source. The cluster uses two different OSes at present, Fedora Core 4 and Debian. These both have different pre-compiled package formats, RPM for Fedora and .deb for Debian. Unfortunately relocating the default install directory of a .deb package is troublesome, so this article only pertains to Fedora. Fedora Core 4 is was EOL'd in 2006<ref>http://fedoraproject.org/wiki/LifeCycle/EOL</ref>and 4 years later there are few remaining resources supporting it. 
You can still find Fedora 4 RPMs at the following links (long unsearchable lists, give them time to load): http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/core/4/i386/os/Fedora/i386 http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/extras/4/i386 http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/core/updates/4/i386 What follows is an example of installing the 'gv' (ghostview) package into a folder defined by an environment variable: <code>$APPSDIR</code> (which has a similar to purpose to <code>$ARCH</code> in the previous section). #find an RPM file for your desired package<br/> #create rpm database directory in local tree (only do this once):<br/><code>mkdir -p $APPSDIR/var/lib/rpm</code> #initialise rpm database (do this only once also):<br/><code>rpm --initdb --root $APPSDIR</code> #check what files are installed by the RPM file of your package<br/><code>rpm -qlp gv-3.6.1-4.fc4.i386.rpm</code> #test for dependencies (install will not be successful and command will report an error accessing the root rpm database, ignore this):<br/><code>rpm -ivh gv-3.6.1-4.fc4.i386.rpm</code><br/>if a dependency is found, the RPM for it must also be found and appended to the <code>rpm -i</code> command (until no further dependencies are required). #do actual install to our local folder:<br/><code>rpm --root $APPSDIR --relocate /usr=$APPSDIR --nodeps -ivh gv-3.6.1-4.fc4.i386.rpm</code> [[Image:Example.jpg]] 35432b850da0d6ca7722e2a4b4e193cbc40d5f90 328 327 2010-04-26T16:50:35Z Davepc 2 /* General Information */ wikitext text/x-wiki http://hcl.ucd.ie/Hardware ==General Information== [[Image:network.jpg|right|Layout of the Cluster]] The hcl cluster is heterogeneous in computing hardware & network ability. Nodes are from Dell, IBM, and HP, with Celeron, Pentium 4, Xeon, and AMD processors ranging in speeds from 1.8 to 3.6Ghz. Accordingly architectures and parameters such as Front Side Bus, Cache, and Main Memory all vary. 
The operating system used is Debian “squeeze” with Linux kernel 2.6.32. The network hardware consists of two Cisco 24+4 port Gigabit switches. Each node has two Gigabit ethernet ports - each eth0 is connected to the first switch, and each eth1 is connected to the second switch. The switches are also connected to each other. The bandwidth of each port can be configured to any value between 8Kb/s and 1Gb/s. This allows testing on a very large number of network topologies. As the bandwidth on the link connecting the two switches can also be configured, the cluster can actually act as two separate clusters connected via one link. The diagram below shows a schematic of the cluster. == Detailed Cluster Specification == A table of hardware configuration is available here: [[Cluster Specification]] == Software packages available on HCL Cluster 2.0 == With a fresh installation of the operating systems on the HCL Cluster, the following packages are available: * autoconf * automake * fftw2 * git * gfortran * gnuplot * libtool * netperf * octave3.2 * qhull * subversion * valgrind * gsl-dev * vim * python * mc * openmpi-bin * openmpi-dev * evince * libboost-graph-dev * libboost-serialization-dev * r-cran-strucchange * graphviz * doxygen [[new hcl node install & configuration log]] [[new heterogeneous.ucd.ie install log]] == Compilation on HCL == * Add to your environment <source lang="bash"> export ARCH=`uname -r` if [ `hostname` == 'hcl13.ucd.ie' ]; then export ARCH=`uname -r`smp fi </source> * On hcl09 and hcl10, create a directory $HOME/$ARCH. Actually, 2.4.27-2-386 and 2.6.11-1.1369_FC4smp directories will be created. * Configure all the software on hcl09 and hcl10, with <code>--prefix=$ARCH</code> == Installing Precompiled Software on HCL Fedora Core 4 nodes == If you need a piece of software that is not already installed, you may not need to compile it from source.
It is possible to install pre-compiled packages to folders in your home directory<ref>http://www.ajay.ws/2009/7/10/install-a-rpm-in-home-directory-as-non-root</ref>, and this is often easier than compiling from source. The cluster uses two different OSes at present, Fedora Core 4 and Debian. These use different pre-compiled package formats: RPM for Fedora and .deb for Debian. Unfortunately, relocating the default install directory of a .deb package is troublesome, so this article only pertains to Fedora. Fedora Core 4 was EOL'd in 2006<ref>http://fedoraproject.org/wiki/LifeCycle/EOL</ref> and four years later few resources supporting it remain. You can still find Fedora 4 RPMs at the following links (long unsearchable lists, give them time to load): http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/core/4/i386/os/Fedora/i386 http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/extras/4/i386 http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/core/updates/4/i386 What follows is an example of installing the 'gv' (ghostview) package into a folder defined by an environment variable: <code>$APPSDIR</code> (which has a similar purpose to <code>$ARCH</code> in the previous section). #find an RPM file for your desired package<br/> #create the rpm database directory in the local tree (only do this once):<br/><code>mkdir -p $APPSDIR/var/lib/rpm</code> #initialise the rpm database (also only once):<br/><code>rpm --initdb --root $APPSDIR</code> #check what files are installed by the RPM file of your package:<br/><code>rpm -qlp gv-3.6.1-4.fc4.i386.rpm</code> #test for dependencies (the install will not succeed and the command will report an error accessing the root rpm database; ignore this):<br/><code>rpm -ivh gv-3.6.1-4.fc4.i386.rpm</code><br/>if a dependency is found, the RPM for it must also be found and appended to the <code>rpm -i</code> command (until no further dependencies are required).
#do the actual install to our local folder:<br/><code>rpm --root $APPSDIR --relocate /usr=$APPSDIR --nodeps -ivh gv-3.6.1-4.fc4.i386.rpm</code> [[Image:Example.jpg]] 7c5732266b1cbe3d2c5e4760fdf81c9e3b4c0bb0 329 328 2010-04-26T16:56:06Z Davepc 2 wikitext text/x-wiki http://hcl.ucd.ie/Hardware ==General Information== [[Image:network.jpg|right|Layout of the Cluster]] The HCL cluster is heterogeneous in computing hardware and network capability. Nodes are from Dell, IBM, and HP, with Celeron, Pentium 4, Xeon, and AMD processors ranging in speed from 1.8 to 3.6GHz. Accordingly, architectures and parameters such as Front Side Bus, Cache, and Main Memory all vary. The operating system used is Debian “squeeze” with Linux kernel 2.6.32. The network hardware consists of two Cisco 24+4 port Gigabit switches. Each node has two Gigabit ethernet ports - each eth0 is connected to the first switch, and each eth1 is connected to the second switch. The switches are also connected to each other. The bandwidth of each port can be configured to any value between 8Kb/s and 1Gb/s. This allows testing on a very large number of network topologies. As the bandwidth on the link connecting the two switches can also be configured, the cluster can actually act as two separate clusters connected via one link. The diagram below shows a schematic of the cluster.
== Detailed Cluster Specification == A table of hardware configuration is available here: [[Cluster Specification]] == Software packages available on HCL Cluster 2.0 == With a fresh installation of the operating systems on the HCL Cluster, the following packages are available: * autoconf * automake * fftw2 * git * gfortran * gnuplot * libtool * netperf * octave3.2 * qhull * subversion * valgrind * gsl-dev * vim * python * mc * openmpi-bin * openmpi-dev * evince * libboost-graph-dev * libboost-serialization-dev * r-cran-strucchange * graphviz * doxygen [[new hcl node install & configuration log]] [[new heterogeneous.ucd.ie install log]] == Access and Security == All access and security for the cluster is handled by the gateway machine (heterogeneous.ucd.ie). This machine is not considered a compute node and should not be used as such. The only new incoming connections allowed are ssh; other incoming packets, such as http responses to requests originating inside the cluster (established or related), are also allowed. Incoming ssh packets are only accepted if they originate from designated IP addresses, which must be registered UCD IPs. csserver.ucd.ie is allowed, as is hclgate.ucd.ie, on which all users have accounts. Thus, to gain access to the cluster, you can ssh from csserver, hclgate or another allowed machine to heterogeneous. From there you can ssh to any of the nodes (hcl01-hcl16). Access from outside the UCD network is only possible once you have gained entry to a server that allows outside connections (such as csserver.ucd.ie). == Compilation on HCL == * Add to your environment <source lang="bash"> export ARCH=`uname -r` if [ `hostname` == 'hcl13.ucd.ie' ]; then export ARCH=`uname -r`smp fi </source> * On hcl09 and hcl10, create a directory $HOME/$ARCH. Actually, 2.4.27-2-386 and 2.6.11-1.1369_FC4smp directories will be created.
* Configure all the software on hcl09 and hcl10, with <code>--prefix=$ARCH</code> == Installing Precompiled Software on HCL Fedora Core 4 nodes == If you need a piece of software that is not already installed, you may not need to compile it from source. It is possible to install pre-compiled packages to folders in your home directory<ref>http://www.ajay.ws/2009/7/10/install-a-rpm-in-home-directory-as-non-root</ref>, and this is often easier than compiling from source. The cluster uses two different OSes at present, Fedora Core 4 and Debian. These use different pre-compiled package formats: RPM for Fedora and .deb for Debian. Unfortunately, relocating the default install directory of a .deb package is troublesome, so this article only pertains to Fedora. Fedora Core 4 was EOL'd in 2006<ref>http://fedoraproject.org/wiki/LifeCycle/EOL</ref> and four years later few resources supporting it remain. You can still find Fedora 4 RPMs at the following links (long unsearchable lists, give them time to load): http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/core/4/i386/os/Fedora/i386 http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/extras/4/i386 http://ftp.heanet.ie/pub/fedora-archive/fedora/linux/core/updates/4/i386 What follows is an example of installing the 'gv' (ghostview) package into a folder defined by an environment variable: <code>$APPSDIR</code> (which has a similar purpose to <code>$ARCH</code> in the previous section).
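As a consolidated sketch, the whole local-RPM workflow looks like the shell fragment below. The package name gv-3.6.1-4.fc4.i386.rpm is the example used on this page; the rpm invocations are left commented out, since they only make sense on an FC4 node with the downloaded package file present, and $APPSDIR is an assumed writable directory in your home.

```shell
# Local (non-root) RPM install into $APPSDIR, per the steps on this page.
# Assumption: $APPSDIR is a writable directory in your home.
APPSDIR="${APPSDIR:-$HOME/apps}"
PKG=gv-3.6.1-4.fc4.i386.rpm

# One-time setup: a private rpm database under $APPSDIR.
mkdir -p "$APPSDIR/var/lib/rpm"
# rpm --initdb --root "$APPSDIR"

# Inspect the package and do a dependency test-install (expected to fail
# against the root database; any missing dependency it names must be
# downloaded and appended to the same rpm -i command).
# rpm -qlp "$PKG"
# rpm -ivh "$PKG"

# Actual install, relocated from /usr into $APPSDIR.
# rpm --root "$APPSDIR" --relocate /usr="$APPSDIR" --nodeps -ivh "$PKG"
echo "would install $PKG under $APPSDIR"
```

Once installed this way, binaries typically land under <code>$APPSDIR/bin</code>, so add that to your <code>PATH</code> (and <code>$APPSDIR/lib</code> to <code>LD_LIBRARY_PATH</code> if the package ships libraries).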
#find an RPM file for your desired package<br/> #create rpm database directory in local tree (only do this once):<br/><code>mkdir -p $APPSDIR/var/lib/rpm</code> #initialise rpm database (do this only once also):<br/><code>rpm --initdb --root $APPSDIR</code> #check what files are installed by the RPM file of your package<br/><code>rpm -qlp gv-3.6.1-4.fc4.i386.rpm</code> #test for dependencies (install will not be successful and command will report an error accessing the root rpm database, ignore this):<br/><code>rpm -ivh gv-3.6.1-4.fc4.i386.rpm</code><br/>if a dependency is found, the RPM for it must also be found and appended to the <code>rpm -i</code> command (until no further dependencies are required). #do actual install to our local folder:<br/><code>rpm --root $APPSDIR --relocate /usr=$APPSDIR --nodeps -ivh gv-3.6.1-4.fc4.i386.rpm</code> [[Image:Example.jpg]] ffb7aeb665a465a4fe5ea66746f5a15c6fea56ad HCL Cluster Specifications 0 50 303 2010-04-26T15:44:37Z Davepc 2 New page: {| FRAME=BOX CELLSPACING=0 COLS=12 RULES=GROUPS BORDER=1<COLGROUP><COL WIDTH=86></COLGROUP><COLGROUP><COL WIDTH=116></COLGROUP><COLGROUP><COL WIDTH=160></COLGROUP><COLGROUP><COL WIDTH=100... 
wikitext text/x-wiki {| FRAME=BOX CELLSPACING=0 COLS=12 RULES=GROUPS BORDER=1<COLGROUP><COL WIDTH=86></COLGROUP><COLGROUP><COL WIDTH=116></COLGROUP><COLGROUP><COL WIDTH=160></COLGROUP><COLGROUP><COL WIDTH=100></COLGROUP><COLGROUP><COL WIDTH=100></COLGROUP><COLGROUP><COL WIDTH=111></COLGROUP><COLGROUP><COL WIDTH=103></COLGROUP><COLGROUP><COL WIDTH=86></COLGROUP><COLGROUP><COL WIDTH=86></COLGROUP><COLGROUP><COL WIDTH=86></COLGROUP><COLGROUP><COL WIDTH=86></COLGROUP><COLGROUP><COL WIDTH=86></COLGROUP><TBODY> | WIDTH=86 HEIGHT=16 ALIGN=LEFT | Rack Slot | WIDTH=116 ALIGN=LEFT | Name | WIDTH=160 ALIGN=LEFT | Make/Model | WIDTH=100 ALIGN=LEFT | O/S | WIDTH=100 ALIGN=LEFT | IP | WIDTH=111 ALIGN=LEFT | Processor | WIDTH=103 ALIGN=LEFT | Front Side Bus | WIDTH=86 ALIGN=LEFT | L2 Cache | WIDTH=86 ALIGN=LEFT | RAM | WIDTH=86 ALIGN=LEFT | HDD 1 | WIDTH=86 ALIGN=LEFT | HDD 2 | WIDTH=86 ALIGN=LEFT | NIC |- | HEIGHT=22 ALIGN=RIGHT SDVAL="42" SDNUM="1033;" | 42 | ALIGN=LEFT | hclswitch1 | ALIGN=LEFT | Cisco Catalyst 3560G | ALIGN=LEFT | 12.2(25)SEB2 | ALIGN=LEFT | 192.168.21.252 | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | 24 x Gigabit |- | HEIGHT=22 ALIGN=RIGHT SDVAL="41" SDNUM="1033;" | 41 | ALIGN=LEFT | hclswitch2 | ALIGN=LEFT | Cisco Catalyst 3560G | ALIGN=LEFT | 12.2(25)SEB2 | ALIGN=LEFT | 192.168.21.253 | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | 24 x Gigabit |- | HEIGHT=16 ALIGN=RIGHT SDNUM="1033;0;M/D/YY" | 1 – 2 | ALIGN=LEFT | N/A | ALIGN=LEFT | APC Smart UPS 1500 | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A |- | HEIGHT=22 ALIGN=RIGHT SDVAL="3" SDNUM="1033;" | 3 | ALIGN=LEFT | Hcl01 (NIC1) | ALIGN=LEFT | Dell Poweredge SC1425 | ALIGN=LEFT | FC4 | ALIGN=LEFT | 192.168.21.3 | 
ALIGN=LEFT | 3.6 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 2MB | ALIGN=LEFT | 256MB | ALIGN=LEFT | 240GB SCSI | ALIGN=LEFT | 80GB SCSI | ALIGN=LEFT | 2 x Gigabit |- | HEIGHT=16 ALIGN=RIGHT SDVAL="3" SDNUM="1033;" | 3 | ALIGN=LEFT | Hcl01 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.103 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |- | HEIGHT=22 ALIGN=RIGHT SDVAL="4" SDNUM="1033;" | 4 | ALIGN=LEFT | Hcl02 (NIC1) | ALIGN=LEFT | Dell Poweredge SC1425 | ALIGN=LEFT | FC4 | ALIGN=LEFT | 192.168.21.4 | ALIGN=LEFT | 3.6 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 2MB | ALIGN=LEFT | 256MB | ALIGN=LEFT | 240GB SCSI | ALIGN=LEFT | 80GB SCSI | ALIGN=LEFT | 2 x Gigabit |- | HEIGHT=16 ALIGN=RIGHT SDVAL="4" SDNUM="1033;" | 4 | ALIGN=LEFT | Hcl02 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.104 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |- | HEIGHT=22 ALIGN=RIGHT SDVAL="5" SDNUM="1033;" | 5 | ALIGN=LEFT | Hcl03 (NIC1) | ALIGN=LEFT | Dell Poweredge 750 | ALIGN=LEFT | FC4 | ALIGN=LEFT | 192.168.21.5 | ALIGN=LEFT | 3.4 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit |- | HEIGHT=16 ALIGN=RIGHT SDVAL="5" SDNUM="1033;" | 5 | ALIGN=LEFT | Hcl03 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.105 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |- | HEIGHT=22 ALIGN=RIGHT SDVAL="6" SDNUM="1033;" | 6 | ALIGN=LEFT | Hcl04 (NIC1) | ALIGN=LEFT | Dell Poweredge 750 | ALIGN=LEFT | FC4 | ALIGN=LEFT | 192.168.21.6 | ALIGN=LEFT | 3.4 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 
x Gigabit |- | HEIGHT=16 ALIGN=RIGHT SDVAL="6" SDNUM="1033;" | 6 | ALIGN=LEFT | Hcl04 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.106 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |- | HEIGHT=22 ALIGN=RIGHT SDVAL="7" SDNUM="1033;" | 7 | ALIGN=LEFT | Hcl05 (NIC1) | ALIGN=LEFT | Dell Poweredge 750 | ALIGN=LEFT | FC4 | ALIGN=LEFT | 192.168.21.7 | ALIGN=LEFT | 3.4 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit |- | HEIGHT=16 ALIGN=RIGHT SDVAL="7" SDNUM="1033;" | 7 | ALIGN=LEFT | Hcl05 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.107 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |- | HEIGHT=22 ALIGN=RIGHT SDVAL="8" SDNUM="1033;" | 8 | ALIGN=LEFT | Hcl06 (NIC1) | ALIGN=LEFT | Dell Poweredge 750 | ALIGN=LEFT | FC4 | ALIGN=LEFT | 192.168.21.8 | ALIGN=LEFT | 3.4 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit |- | HEIGHT=16 ALIGN=RIGHT SDVAL="8" SDNUM="1033;" | 8 | ALIGN=LEFT | Hcl06 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.108 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |- | HEIGHT=22 ALIGN=RIGHT SDVAL="9" SDNUM="1033;" | 9 | ALIGN=LEFT | Hcl07 (NIC1) | ALIGN=LEFT | Dell Poweredge 750 | ALIGN=LEFT | FC4 | ALIGN=LEFT | 192.168.21.9 | ALIGN=LEFT | 3.4 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 256MB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit |- | HEIGHT=16 ALIGN=RIGHT SDVAL="9" SDNUM="1033;" | 9 | ALIGN=LEFT | Hcl07 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | 
192.168.21.109 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |- | HEIGHT=22 ALIGN=RIGHT SDVAL="10" SDNUM="1033;" | 10 | ALIGN=LEFT | Hcl08 (NIC1) | ALIGN=LEFT | Dell Poweredge 750 | ALIGN=LEFT | FC4 | ALIGN=LEFT | 192.168.21.10 | ALIGN=LEFT | 3.4 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 256MB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit |- | HEIGHT=16 ALIGN=RIGHT SDVAL="10" SDNUM="1033;" | 10 | ALIGN=LEFT | Hcl08 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.110 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |- | HEIGHT=22 ALIGN=RIGHT SDVAL="11" SDNUM="1033;" | 11 | ALIGN=LEFT | Hcl09 (NIC1) | ALIGN=LEFT | IBM E-server 326 | ALIGN=LEFT | Debian | ALIGN=LEFT | 192.168.21.11 | ALIGN=LEFT | 1.8 AMD Opteron | ALIGN=LEFT | 1GHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit |- | HEIGHT=16 ALIGN=RIGHT SDVAL="11" SDNUM="1033;" | 11 | ALIGN=LEFT | Hcl09 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.111 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |- | HEIGHT=22 ALIGN=RIGHT SDVAL="12" SDNUM="1033;" | 12 | ALIGN=LEFT | Hcl10 (NIC1) | ALIGN=LEFT | IBM E-server 326 | ALIGN=LEFT | FC4 | ALIGN=LEFT | 192.168.21.12 | ALIGN=LEFT | 1.8 AMD Opteron | ALIGN=LEFT | 1GHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit |- | HEIGHT=16 ALIGN=RIGHT SDVAL="12" SDNUM="1033;" | 12 | ALIGN=LEFT | Hcl10 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.112 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 
ALIGN=LEFT | <BR> |- | HEIGHT=22 ALIGN=RIGHT SDVAL="13" SDNUM="1033;" | 13 | ALIGN=LEFT | Hcl11 (NIC1) | ALIGN=LEFT | IBM X-Series 306 | ALIGN=LEFT | Debian | ALIGN=LEFT | 192.168.21.13 | ALIGN=LEFT | 3.2 P4 | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 512MB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit |- | HEIGHT=16 ALIGN=RIGHT SDVAL="13" SDNUM="1033;" | 13 | ALIGN=LEFT | Hcl11 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.113 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |- | HEIGHT=22 ALIGN=RIGHT SDVAL="14" SDNUM="1033;" | 14 | ALIGN=LEFT | Hcl12 (NIC1) | ALIGN=LEFT | HP Proliant DL 320 G3 | ALIGN=LEFT | FC4 | ALIGN=LEFT | 192.168.21.14 | ALIGN=LEFT | 3.4 P4 | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 512MB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit |- | HEIGHT=16 ALIGN=RIGHT SDVAL="14" SDNUM="1033;" | 14 | ALIGN=LEFT | Hcl12 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.114 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |- | HEIGHT=22 ALIGN=RIGHT SDVAL="15" SDNUM="1033;" | 15 | ALIGN=LEFT | Hcl13 (NIC1) | ALIGN=LEFT | HP Proliant DL 320 G3 | ALIGN=LEFT | FC4 | ALIGN=LEFT | 192.168.21.15 | ALIGN=LEFT | 2.9 Celeron | ALIGN=LEFT | 533MHz | ALIGN=LEFT | 256KB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit |- | HEIGHT=16 ALIGN=RIGHT SDVAL="15" SDNUM="1033;" | 15 | ALIGN=LEFT | Hcl13 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.115 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |- | HEIGHT=22 ALIGN=RIGHT SDVAL="16" SDNUM="1033;" | 16 | ALIGN=LEFT | Hcl14 (NIC1) | ALIGN=LEFT | HP Proliant DL 140 
G2 | ALIGN=LEFT | Debian | ALIGN=LEFT | 192.168.21.16 | ALIGN=LEFT | 3.4 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit |- | HEIGHT=16 ALIGN=RIGHT SDVAL="16" SDNUM="1033;" | 16 | ALIGN=LEFT | Hcl14 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.116 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |- | HEIGHT=22 ALIGN=RIGHT SDVAL="17" SDNUM="1033;" | 17 | ALIGN=LEFT | Hcl15 (NIC1) | ALIGN=LEFT | HP Proliant DL 140 G2 | ALIGN=LEFT | Debian | ALIGN=LEFT | 192.168.21.17 | ALIGN=LEFT | 2.8 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit |- | HEIGHT=16 ALIGN=RIGHT SDVAL="17" SDNUM="1033;" | 17 | ALIGN=LEFT | Hcl15 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.117 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |- | HEIGHT=22 ALIGN=RIGHT SDVAL="18" SDNUM="1033;" | 18 | ALIGN=LEFT | Hcl16 (NIC1) | ALIGN=LEFT | HP Proliant DL 140 G2 | ALIGN=LEFT | Debian | ALIGN=LEFT | 192.168.21.18 | ALIGN=LEFT | 3.6 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 2MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit |- | HEIGHT=16 ALIGN=RIGHT SDVAL="18" SDNUM="1033;" | 18 | ALIGN=LEFT | Hcl16 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.118 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |} 37d4aac658f0a2c2c93b76dfdfbef84eb681d72f 304 303 2010-04-26T15:53:36Z Davepc 2 wikitext text/x-wiki {| class="wikitable" |- ! header 1 ! header 2 ! 
header 3 |- | row 1, cell 1 | row 1, cell 2 | row 1, cell 3 |- | row 2, cell 1 | row 2, cell 2 | row 2, cell 3 |} {| FRAME=BOX CELLSPACING=0 COLS=12 RULES=GROUPS BORDER=1 style="border: 1px solid darkgray;" | WIDTH=86 HEIGHT=16 ALIGN=LEFT | Rack Slot | WIDTH=116 ALIGN=LEFT | Name | WIDTH=160 ALIGN=LEFT | Make/Model | WIDTH=100 ALIGN=LEFT | IP | WIDTH=111 ALIGN=LEFT | Processor | WIDTH=103 ALIGN=LEFT | Front Side Bus | WIDTH=86 ALIGN=LEFT | L2 Cache | WIDTH=86 ALIGN=LEFT | RAM | WIDTH=86 ALIGN=LEFT | HDD 1 | WIDTH=86 ALIGN=LEFT | HDD 2 | WIDTH=86 ALIGN=LEFT | NIC |- | HEIGHT=22 ALIGN=RIGHT SDVAL="42" SDNUM="1033;" | 42 | ALIGN=LEFT | hclswitch1 | ALIGN=LEFT | Cisco Catalyst 3560G | ALIGN=LEFT | 192.168.21.252 | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | 24 x Gigabit |- | HEIGHT=22 ALIGN=RIGHT SDVAL="41" SDNUM="1033;" | 41 | ALIGN=LEFT | hclswitch2 | ALIGN=LEFT | Cisco Catalyst 3560G | ALIGN=LEFT | 192.168.21.253 | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | 24 x Gigabit |- | HEIGHT=16 ALIGN=RIGHT SDNUM="1033;0;M/D/YY" | 1 – 2 | ALIGN=LEFT | N/A | ALIGN=LEFT | APC Smart UPS 1500 | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A |- | HEIGHT=22 ALIGN=RIGHT SDVAL="3" SDNUM="1033;" | 3 | ALIGN=LEFT | Hcl01 (NIC1) | ALIGN=LEFT | Dell Poweredge SC1425 | ALIGN=LEFT | 192.168.21.3 | ALIGN=LEFT | 3.6 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 2MB | ALIGN=LEFT | 256MB | ALIGN=LEFT | 240GB SCSI | ALIGN=LEFT | 80GB SCSI | ALIGN=LEFT | 2 x Gigabit |- | HEIGHT=16 ALIGN=RIGHT SDVAL="3" SDNUM="1033;" | 3 | ALIGN=LEFT | Hcl01 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.103 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> 
|- | HEIGHT=22 ALIGN=RIGHT SDVAL="4" SDNUM="1033;" | 4 | ALIGN=LEFT | Hcl02 (NIC1) | ALIGN=LEFT | Dell Poweredge SC1425 | ALIGN=LEFT | 192.168.21.4 | ALIGN=LEFT | 3.6 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 2MB | ALIGN=LEFT | 256MB | ALIGN=LEFT | 240GB SCSI | ALIGN=LEFT | 80GB SCSI | ALIGN=LEFT | 2 x Gigabit |- | HEIGHT=16 ALIGN=RIGHT SDVAL="4" SDNUM="1033;" | 4 | ALIGN=LEFT | Hcl02 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.104 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |- | HEIGHT=22 ALIGN=RIGHT SDVAL="5" SDNUM="1033;" | 5 | ALIGN=LEFT | Hcl03 (NIC1) | ALIGN=LEFT | Dell Poweredge 750 | ALIGN=LEFT | 192.168.21.5 | ALIGN=LEFT | 3.4 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit |- | HEIGHT=16 ALIGN=RIGHT SDVAL="5" SDNUM="1033;" | 5 | ALIGN=LEFT | Hcl03 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.105 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |- | HEIGHT=22 ALIGN=RIGHT SDVAL="6" SDNUM="1033;" | 6 | ALIGN=LEFT | Hcl04 (NIC1) | ALIGN=LEFT | Dell Poweredge 750 | ALIGN=LEFT | 192.168.21.6 | ALIGN=LEFT | 3.4 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit |- | HEIGHT=16 ALIGN=RIGHT SDVAL="6" SDNUM="1033;" | 6 | ALIGN=LEFT | Hcl04 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.106 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |- | HEIGHT=22 ALIGN=RIGHT SDVAL="7" SDNUM="1033;" | 7 | ALIGN=LEFT | Hcl05 (NIC1) | ALIGN=LEFT | Dell Poweredge 750 | ALIGN=LEFT | 192.168.21.7 | ALIGN=LEFT | 3.4 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 
1MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit |- | HEIGHT=16 ALIGN=RIGHT SDVAL="7" SDNUM="1033;" | 7 | ALIGN=LEFT | Hcl05 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.107 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |- | HEIGHT=22 ALIGN=RIGHT SDVAL="8" SDNUM="1033;" | 8 | ALIGN=LEFT | Hcl06 (NIC1) | ALIGN=LEFT | Dell Poweredge 750 | ALIGN=LEFT | 192.168.21.8 | ALIGN=LEFT | 3.4 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit |- | HEIGHT=16 ALIGN=RIGHT SDVAL="8" SDNUM="1033;" | 8 | ALIGN=LEFT | Hcl06 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.108 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |- | HEIGHT=22 ALIGN=RIGHT SDVAL="9" SDNUM="1033;" | 9 | ALIGN=LEFT | Hcl07 (NIC1) | ALIGN=LEFT | Dell Poweredge 750 | ALIGN=LEFT | 192.168.21.9 | ALIGN=LEFT | 3.4 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 256MB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit |- | HEIGHT=16 ALIGN=RIGHT SDVAL="9" SDNUM="1033;" | 9 | ALIGN=LEFT | Hcl07 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.109 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |- | HEIGHT=22 ALIGN=RIGHT SDVAL="10" SDNUM="1033;" | 10 | ALIGN=LEFT | Hcl08 (NIC1) | ALIGN=LEFT | Dell Poweredge 750 | ALIGN=LEFT | 192.168.21.10 | ALIGN=LEFT | 3.4 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 256MB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit |- | HEIGHT=16 ALIGN=RIGHT SDVAL="10" SDNUM="1033;" | 10 | ALIGN=LEFT | Hcl08 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 
<BR> | ALIGN=LEFT | 192.168.21.110 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |- | HEIGHT=22 ALIGN=RIGHT SDVAL="11" SDNUM="1033;" | 11 | ALIGN=LEFT | Hcl09 (NIC1) | ALIGN=LEFT | IBM E-server 326 | ALIGN=LEFT | 192.168.21.11 | ALIGN=LEFT | 1.8 AMD Opteron | ALIGN=LEFT | 1GHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit |- | HEIGHT=16 ALIGN=RIGHT SDVAL="11" SDNUM="1033;" | 11 | ALIGN=LEFT | Hcl09 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.111 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |- | HEIGHT=22 ALIGN=RIGHT SDVAL="12" SDNUM="1033;" | 12 | ALIGN=LEFT | Hcl10 (NIC1) | ALIGN=LEFT | IBM E-server 326 | ALIGN=LEFT | 192.168.21.12 | ALIGN=LEFT | 1.8 AMD Opteron | ALIGN=LEFT | 1GHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit |- | HEIGHT=16 ALIGN=RIGHT SDVAL="12" SDNUM="1033;" | 12 | ALIGN=LEFT | Hcl10 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.112 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |- | HEIGHT=22 ALIGN=RIGHT SDVAL="13" SDNUM="1033;" | 13 | ALIGN=LEFT | Hcl11 (NIC1) | ALIGN=LEFT | IBM X-Series 306 | ALIGN=LEFT | 192.168.21.13 | ALIGN=LEFT | 3.2 P4 | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 512MB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit |- | HEIGHT=16 ALIGN=RIGHT SDVAL="13" SDNUM="1033;" | 13 | ALIGN=LEFT | Hcl11 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.113 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |- | HEIGHT=22 ALIGN=RIGHT 
SDVAL="14" SDNUM="1033;" | 14 | ALIGN=LEFT | Hcl12 (NIC1) | ALIGN=LEFT | HP Proliant DL 320 G3 | ALIGN=LEFT | 192.168.21.14 | ALIGN=LEFT | 3.4 P4 | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 512MB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit |- | HEIGHT=16 ALIGN=RIGHT SDVAL="14" SDNUM="1033;" | 14 | ALIGN=LEFT | Hcl12 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.114 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |- | HEIGHT=22 ALIGN=RIGHT SDVAL="15" SDNUM="1033;" | 15 | ALIGN=LEFT | Hcl13 (NIC1) | ALIGN=LEFT | HP Proliant DL 320 G3 | ALIGN=LEFT | 192.168.21.15 | ALIGN=LEFT | 2.9 Celeron | ALIGN=LEFT | 533MHz | ALIGN=LEFT | 256KB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit |- | HEIGHT=16 ALIGN=RIGHT SDVAL="15" SDNUM="1033;" | 15 | ALIGN=LEFT | Hcl13 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.115 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |- | HEIGHT=22 ALIGN=RIGHT SDVAL="16" SDNUM="1033;" | 16 | ALIGN=LEFT | Hcl14 (NIC1) | ALIGN=LEFT | HP Proliant DL 140 G2 | ALIGN=LEFT | 192.168.21.16 | ALIGN=LEFT | 3.4 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit |- | HEIGHT=16 ALIGN=RIGHT SDVAL="16" SDNUM="1033;" | 16 | ALIGN=LEFT | Hcl14 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.116 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |- | HEIGHT=22 ALIGN=RIGHT SDVAL="17" SDNUM="1033;" | 17 | ALIGN=LEFT | Hcl15 (NIC1) | ALIGN=LEFT | HP Proliant DL 140 G2 | ALIGN=LEFT | 192.168.21.17 | ALIGN=LEFT | 2.8 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB 
SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit |- | HEIGHT=16 ALIGN=RIGHT SDVAL="17" SDNUM="1033;" | 17 | ALIGN=LEFT | Hcl15 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.117 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |- | HEIGHT=22 ALIGN=RIGHT SDVAL="18" SDNUM="1033;" | 18 | ALIGN=LEFT | Hcl16 (NIC1) | ALIGN=LEFT | HP Proliant DL 140 G2 | ALIGN=LEFT | 192.168.21.18 | ALIGN=LEFT | 3.6 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 2MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit |- | HEIGHT=16 ALIGN=RIGHT SDVAL="18" SDNUM="1033;" | 18 | ALIGN=LEFT | Hcl16 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.118 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |} f55f0284aee68258ae374f6b3c4dd674c28db8e0 305 304 2010-04-26T15:54:48Z Davepc 2 wikitext text/x-wiki {| cellpadding="2" style="border: 1px solid darkgray;" |- ! header 1 ! header 2 ! 
header 3 |- | row 1, cell 1 | row 1, cell 2 | row 1, cell 3 |- | row 2, cell 1 | row 2, cell 2 | row 2, cell 3 |} 308 307 2010-04-26T15:59:38Z Davepc 2 wikitext text/x-wiki {| cellpadding="2" style="border: 1px solid darkgray;" ! width="140" | Left ! width="150" | Middle ! width="130" | Right |- align="center" | style="border: 1px solid"| a | style="border: 1px solid"| [[File:StarIconGold.png|120px|Caption when mouse-over image]] | style="border: 1px solid #22AA55;"|<!--greenish border--> [[File:StarIconGreen.png|120px|Green stellar icon]] |- align="center" |Bronze star || Gold star || Green star |} {| cellpadding="2" style="border: 1px solid darkgray;" |- ! header 1 ! header 2 !
header 3 |- | row 1, cell 1 | row 1, cell 2 | row 1, cell 3 |- | row 2, cell 1 | row 2, cell 2 | row 2, cell 3 |} e32d1ea81572c3195f58aff4f01602513e467521 309 308 2010-04-26T16:00:10Z Davepc 2 wikitext text/x-wiki {| cellpadding="2" style="border: 1px solid darkgray;" ! width="140" | Left ! width="150" | Middle ! width="130" | Right |- align="center" | style="border: 1px solid"| a | style="border: 1px solid"| [[File:StarIconGold.png|120px|Caption when mouse-over image]] | style="border: 1px solid #22AA55;"|<!--greenish border--> [[File:StarIconGreen.png|120px|Green stellar icon]] |- align="center" |Bronze star || Gold star || Green star |} {| cellpadding="2" style="border: 1px solid darkgray;" |- ! header 1 ! header 2 !
header 3 |- | row 1, cell 1 More cell 1 | row 1, cell 2 | row 1, cell 3 |- | row 2, cell 1 | row 2, cell 2 | row 2, cell 3 |} bc1fc3b2b301f6dadd86c814f559cbc6bd0930be 310 309 2010-04-26T16:02:24Z Davepc 2 wikitext text/x-wiki [[test table]] {| cellpadding="2" style="border: 1px solid darkgray;" ! width="140" | Left ! width="150" | Middle ! width="130" | Right |- align="center" | style="border: 1px solid"| a | style="border: 1px solid"| [[File:StarIconGold.png|120px|Caption when mouse-over image]] | style="border: 1px solid #22AA55;"|<!--greenish border--> [[File:StarIconGreen.png|120px|Green stellar icon]] |- align="center" |Bronze star || Gold star || Green star |} {| cellpadding="2" style="border: 1px solid darkgray;" |- ! header 1 ! header 2 !
header 3 |- | row 1, cell 1 More cell 1 | row 1, cell 2 | row 1, cell 3 |- | row 2, cell 1 | row 2, cell 2 | row 2, cell 3 |} {| FRAME=BOX CELLSPACING=0 COLS=12 RULES=GROUPS BORDER=1 style="border: 1px solid darkgray;" | WIDTH=86 HEIGHT=16 ALIGN=LEFT | Rack Slot | WIDTH=116 ALIGN=LEFT | Name | WIDTH=160 ALIGN=LEFT | Make/Model | WIDTH=100 ALIGN=LEFT | IP | WIDTH=111 ALIGN=LEFT | Processor | WIDTH=103 ALIGN=LEFT | Front Side Bus | WIDTH=86 ALIGN=LEFT | L2 Cache | WIDTH=86 ALIGN=LEFT | RAM | WIDTH=86 ALIGN=LEFT | HDD 1 | WIDTH=86 ALIGN=LEFT | HDD 2 | WIDTH=86 ALIGN=LEFT | NIC |- | HEIGHT=22 ALIGN=RIGHT SDVAL="17" SDNUM="1033;" | 17 | ALIGN=LEFT | Hcl15 (NIC1) | ALIGN=LEFT | HP Proliant DL 140 G2 | ALIGN=LEFT | 192.168.21.17 | ALIGN=LEFT | 2.8 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB |
ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit |- | HEIGHT=16 ALIGN=RIGHT SDVAL="17" SDNUM="1033;" | 17 | ALIGN=LEFT | Hcl15 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.117 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |- | HEIGHT=22 ALIGN=RIGHT SDVAL="18" SDNUM="1033;" | 18 | ALIGN=LEFT | Hcl16 (NIC1) | ALIGN=LEFT | HP Proliant DL 140 G2 | ALIGN=LEFT | 192.168.21.18 | ALIGN=LEFT | 3.6 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 2MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit |- | HEIGHT=16 ALIGN=RIGHT SDVAL="18" SDNUM="1033;" | 18 | ALIGN=LEFT | Hcl16 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.118 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |} c258f7127783b76fa92955afbc948f08e81287d5 312 310 2010-04-26T16:05:16Z Davepc 2 wikitext text/x-wiki [[test table]] {| border="1" ! width="140" | Left ! width="150" | Middle ! width="130" | Right |- align="center" | style="border: 1px solid"| a | style="border: 1px solid"| [[File:StarIconGold.png|120px|Caption when mouse-over image]] | style="border: 1px solid #22AA55;"|<!--greenish border--> [[File:StarIconGreen.png|120px|Green stellar icon]] |- align="center" |Bronze star || Gold star || Green star |} {| cellpadding="2" style="border: 1px solid darkgray;" |- ! header 1 ! header 2 ! 
header 3 |- | row 1, cell 1 More cell 1 | row 1, cell 2 | row 1, cell 3 |- | row 2, cell 1 | row 2, cell 2 | row 2, cell 3 |}
b952ffa6825cf58c9af94d2d1cc9c36bd8035739 314 312 2010-04-26T16:09:27Z Davepc 2 wikitext text/x-wiki {| cellpadding="2" style="border: 1px solid darkgray;" |- ! header 1 ! header 2 !
header 3 |- | row 1, cell 1 More cell 1 | row 1, cell 2 | row 1, cell 3 |- | row 2, cell 1 | row 2, cell 2 | row 2, cell 3 |}
7bc735b438a4981a46efad29d62f36d42a9e2d2f 315 314 2010-04-26T16:12:58Z Davepc 2 wikitext text/x-wiki
6ecc3817d67b7f1e6b92e53f68975ee746087414 316 315 2010-04-26T16:13:34Z Davepc 2 wikitext text/x-wiki {| border="1" cellspacing="1" | Rack | Name | Make/Model | IP | Processor | Front Side Bus | L2 Cache | RAM | HDD 1 | HDD 2 | NIC |- | 42 | ALIGN=LEFT | hclswitch1 | ALIGN=LEFT | Cisco Catalyst 3560G | ALIGN=LEFT | 192.168.21.252 | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | 24 x Gigabit |- | 41 | ALIGN=LEFT | hclswitch2 | ALIGN=LEFT | Cisco Catalyst 3560G | ALIGN=LEFT | 192.168.21.253 | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | 24 x Gigabit |- | 1 – 2 | ALIGN=LEFT | N/A | ALIGN=LEFT | APC Smart UPS 1500 | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A |- | 3 | ALIGN=LEFT | Hcl01 (NIC1) | ALIGN=LEFT | Dell Poweredge SC1425 | ALIGN=LEFT | 192.168.21.3 | ALIGN=LEFT | 3.6 Xeon | ALIGN=LEFT | 800MHz |
ALIGN=LEFT | 2MB | ALIGN=LEFT | 256MB | ALIGN=LEFT | 240GB SCSI | ALIGN=LEFT | 80GB SCSI | ALIGN=LEFT | 2 x Gigabit |- | 3 | ALIGN=LEFT | Hcl01 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.103 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |- | 4 | ALIGN=LEFT | Hcl02 (NIC1) | ALIGN=LEFT | Dell Poweredge SC1425 | ALIGN=LEFT | 192.168.21.4 | ALIGN=LEFT | 3.6 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 2MB | ALIGN=LEFT | 256MB | ALIGN=LEFT | 240GB SCSI | ALIGN=LEFT | 80GB SCSI | ALIGN=LEFT | 2 x Gigabit |- | 4 | ALIGN=LEFT | Hcl02 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.104 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |- | 5 | ALIGN=LEFT | Hcl03 (NIC1) | ALIGN=LEFT | Dell Poweredge 750 | ALIGN=LEFT | 192.168.21.5 | ALIGN=LEFT | 3.4 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit |- | 5 | ALIGN=LEFT | Hcl03 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.105 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |- | 6 | ALIGN=LEFT | Hcl04 (NIC1) | ALIGN=LEFT | Dell Poweredge 750 | ALIGN=LEFT | 192.168.21.6 | ALIGN=LEFT | 3.4 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit |- | 6 | ALIGN=LEFT | Hcl04 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.106 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |- | 7 | ALIGN=LEFT | Hcl05 (NIC1) | ALIGN=LEFT | Dell Poweredge 750 | ALIGN=LEFT | 192.168.21.7 | ALIGN=LEFT | 3.4 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | 
ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit |- | 7 | ALIGN=LEFT | Hcl05 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.107 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |- | 8 | ALIGN=LEFT | Hcl06 (NIC1) | ALIGN=LEFT | Dell Poweredge 750 | ALIGN=LEFT | 192.168.21.8 | ALIGN=LEFT | 3.4 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit |- | 8 | ALIGN=LEFT | Hcl06 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.108 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |- | 9 | ALIGN=LEFT | Hcl07 (NIC1) | ALIGN=LEFT | Dell Poweredge 750 | ALIGN=LEFT | 192.168.21.9 | ALIGN=LEFT | 3.4 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 256MB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit |- | 9 | ALIGN=LEFT | Hcl07 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.109 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |- | 10 | ALIGN=LEFT | Hcl08 (NIC1) | ALIGN=LEFT | Dell Poweredge 750 | ALIGN=LEFT | 192.168.21.10 | ALIGN=LEFT | 3.4 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 256MB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit |- | 10 | ALIGN=LEFT | Hcl08 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.110 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |- | 11 | ALIGN=LEFT | Hcl09 (NIC1) | ALIGN=LEFT | IBM E-server 326 | ALIGN=LEFT | 192.168.21.11 | ALIGN=LEFT | 1.8 AMD Opteron | ALIGN=LEFT | 1GHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit |- | 11 | ALIGN=LEFT | Hcl09 (NIC2) 
| ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.111 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |- | HEIGHT=22 ALIGN=RIGHT SDVAL="12" SDNUM="1033;" | 12 | ALIGN=LEFT | Hcl10 (NIC1) | ALIGN=LEFT | IBM E-server 326 | ALIGN=LEFT | 192.168.21.12 | ALIGN=LEFT | 1.8 AMD Opteron | ALIGN=LEFT | 1GHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit |- | HEIGHT=16 ALIGN=RIGHT SDVAL="12" SDNUM="1033;" | 12 | ALIGN=LEFT | Hcl10 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.112 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |- | 13 | ALIGN=LEFT | Hcl11 (NIC1) | ALIGN=LEFT | IBM X-Series 306 | ALIGN=LEFT | 192.168.21.13 | ALIGN=LEFT | 3.2 P4 | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 512MB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit |- | 13 | ALIGN=LEFT | Hcl11 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.113 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |- | 14 | ALIGN=LEFT | Hcl12 (NIC1) | ALIGN=LEFT | HP Proliant DL 320 G3 | ALIGN=LEFT | 192.168.21.14 | ALIGN=LEFT | 3.4 P4 | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 512MB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit |- | 14 | ALIGN=LEFT | Hcl12 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.114 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |- | 15 | ALIGN=LEFT | Hcl13 (NIC1) | ALIGN=LEFT | HP Proliant DL 320 G3 | ALIGN=LEFT | 192.168.21.15 | ALIGN=LEFT | 2.9 Celeron | ALIGN=LEFT | 533MHz | ALIGN=LEFT | 256KB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit |- | 
15 | ALIGN=LEFT | Hcl13 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.115 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |- | 16 | ALIGN=LEFT | Hcl14 (NIC1) | ALIGN=LEFT | HP Proliant DL 140 G2 | ALIGN=LEFT | 192.168.21.16 | ALIGN=LEFT | 3.4 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit |- | 16 | ALIGN=LEFT | Hcl14 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.116 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |- | 17 | ALIGN=LEFT | Hcl15 (NIC1) | ALIGN=LEFT | HP Proliant DL 140 G2 | ALIGN=LEFT | 192.168.21.17 | ALIGN=LEFT | 2.8 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit |- | 17 | ALIGN=LEFT | Hcl15 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.117 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |- | 18 | ALIGN=LEFT | Hcl16 (NIC1) | ALIGN=LEFT | HP Proliant DL 140 G2 | ALIGN=LEFT | 192.168.21.18 | ALIGN=LEFT | 3.6 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 2MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit |- | 18 | ALIGN=LEFT | Hcl16 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.118 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |} 7c08c1fe403abbb65ef6166a1994d59bf755f081 317 316 2010-04-26T16:14:14Z Davepc 2 wikitext text/x-wiki {| border="1" cellspacing="1" cellpadding="5" | Rack | Name | Make/Model | IP | Processor | Front Side Bus | L2 Cache | RAM | HDD 1 | HDD 2 | NIC |- | 42 | ALIGN=LEFT | hclswitch1 | ALIGN=LEFT | Cisco Catalyst 3560G | 
ALIGN=LEFT | 192.168.21.252 | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | 24 x Gigabit |- | 41 | ALIGN=LEFT | hclswitch2 | ALIGN=LEFT | Cisco Catalyst 3560G | ALIGN=LEFT | 192.168.21.253 | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | 24 x Gigabit |- | 1 – 2 | ALIGN=LEFT | N/A | ALIGN=LEFT | APC Smart UPS 1500 | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A |- | 3 | ALIGN=LEFT | Hcl01 (NIC1) | ALIGN=LEFT | Dell Poweredge SC1425 | ALIGN=LEFT | 192.168.21.3 | ALIGN=LEFT | 3.6 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 2MB | ALIGN=LEFT | 256MB | ALIGN=LEFT | 240GB SCSI | ALIGN=LEFT | 80GB SCSI | ALIGN=LEFT | 2 x Gigabit |- | 3 | ALIGN=LEFT | Hcl01 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.103 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |- | 4 | ALIGN=LEFT | Hcl02 (NIC1) | ALIGN=LEFT | Dell Poweredge SC1425 | ALIGN=LEFT | 192.168.21.4 | ALIGN=LEFT | 3.6 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 2MB | ALIGN=LEFT | 256MB | ALIGN=LEFT | 240GB SCSI | ALIGN=LEFT | 80GB SCSI | ALIGN=LEFT | 2 x Gigabit |- | 4 | ALIGN=LEFT | Hcl02 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.104 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |- | 5 | ALIGN=LEFT | Hcl03 (NIC1) | ALIGN=LEFT | Dell Poweredge 750 | ALIGN=LEFT | 192.168.21.5 | ALIGN=LEFT | 3.4 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit |- | 5 | ALIGN=LEFT | Hcl03 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.105 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | 
<BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |- | 6 | ALIGN=LEFT | Hcl04 (NIC1) | ALIGN=LEFT | Dell Poweredge 750 | ALIGN=LEFT | 192.168.21.6 | ALIGN=LEFT | 3.4 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit |- | 6 | ALIGN=LEFT | Hcl04 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.106 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |- | 7 | ALIGN=LEFT | Hcl05 (NIC1) | ALIGN=LEFT | Dell Poweredge 750 | ALIGN=LEFT | 192.168.21.7 | ALIGN=LEFT | 3.4 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit |- | 7 | ALIGN=LEFT | Hcl05 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.107 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |- | 8 | ALIGN=LEFT | Hcl06 (NIC1) | ALIGN=LEFT | Dell Poweredge 750 | ALIGN=LEFT | 192.168.21.8 | ALIGN=LEFT | 3.4 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit |- | 8 | ALIGN=LEFT | Hcl06 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.108 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |- | 9 | ALIGN=LEFT | Hcl07 (NIC1) | ALIGN=LEFT | Dell Poweredge 750 | ALIGN=LEFT | 192.168.21.9 | ALIGN=LEFT | 3.4 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 256MB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit |- | 9 | ALIGN=LEFT | Hcl07 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.109 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |- | 
10 | ALIGN=LEFT | Hcl08 (NIC1) | ALIGN=LEFT | Dell Poweredge 750 | ALIGN=LEFT | 192.168.21.10 | ALIGN=LEFT | 3.4 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 256MB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit |- | 10 | ALIGN=LEFT | Hcl08 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.110 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |- | 11 | ALIGN=LEFT | Hcl09 (NIC1) | ALIGN=LEFT | IBM E-server 326 | ALIGN=LEFT | 192.168.21.11 | ALIGN=LEFT | 1.8 AMD Opteron | ALIGN=LEFT | 1GHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit |- | 11 | ALIGN=LEFT | Hcl09 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.111 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |- | HEIGHT=22 ALIGN=RIGHT SDVAL="12" SDNUM="1033;" | 12 | ALIGN=LEFT | Hcl10 (NIC1) | ALIGN=LEFT | IBM E-server 326 | ALIGN=LEFT | 192.168.21.12 | ALIGN=LEFT | 1.8 AMD Opteron | ALIGN=LEFT | 1GHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit |- | HEIGHT=16 ALIGN=RIGHT SDVAL="12" SDNUM="1033;" | 12 | ALIGN=LEFT | Hcl10 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.112 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |- | 13 | ALIGN=LEFT | Hcl11 (NIC1) | ALIGN=LEFT | IBM X-Series 306 | ALIGN=LEFT | 192.168.21.13 | ALIGN=LEFT | 3.2 P4 | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 512MB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit |- | 13 | ALIGN=LEFT | Hcl11 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.113 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> 
| ALIGN=LEFT | <BR> |- | 14 | ALIGN=LEFT | Hcl12 (NIC1) | ALIGN=LEFT | HP Proliant DL 320 G3 | ALIGN=LEFT | 192.168.21.14 | ALIGN=LEFT | 3.4 P4 | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 512MB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit |- | 14 | ALIGN=LEFT | Hcl12 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.114 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |- | 15 | ALIGN=LEFT | Hcl13 (NIC1) | ALIGN=LEFT | HP Proliant DL 320 G3 | ALIGN=LEFT | 192.168.21.15 | ALIGN=LEFT | 2.9 Celeron | ALIGN=LEFT | 533MHz | ALIGN=LEFT | 256KB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit |- | 15 | ALIGN=LEFT | Hcl13 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.115 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |- | 16 | ALIGN=LEFT | Hcl14 (NIC1) | ALIGN=LEFT | HP Proliant DL 140 G2 | ALIGN=LEFT | 192.168.21.16 | ALIGN=LEFT | 3.4 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit |- | 16 | ALIGN=LEFT | Hcl14 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.116 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |- | 17 | ALIGN=LEFT | Hcl15 (NIC1) | ALIGN=LEFT | HP Proliant DL 140 G2 | ALIGN=LEFT | 192.168.21.17 | ALIGN=LEFT | 2.8 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit |- | 17 | ALIGN=LEFT | Hcl15 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.117 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |- | 18 | ALIGN=LEFT | Hcl16 (NIC1) | 
ALIGN=LEFT | HP Proliant DL 140 G2 | ALIGN=LEFT | 192.168.21.18 | ALIGN=LEFT | 3.6 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 2MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit |- | 18 | ALIGN=LEFT | Hcl16 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.118 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |} b2157e7d9044965ee8c565d649a982cefe86b472 318 317 2010-04-26T16:22:12Z Davepc 2 wikitext text/x-wiki {| border="1" cellspacing="1" cellpadding="5" | Rack | Name | Make/Model | IP | Processor | Front Side Bus | L2 Cache | RAM | HDD 1 | HDD 2 | NIC |- | 42 | ALIGN=LEFT | hclswitch1 | ALIGN=LEFT | Cisco Catalyst 3560G | ALIGN=LEFT | 192.168.21.252 | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | 24 x Gigabit |- | 41 | ALIGN=LEFT | hclswitch2 | ALIGN=LEFT | Cisco Catalyst 3560G | ALIGN=LEFT | 192.168.21.253 | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | 24 x Gigabit |- | 1 – 2 | ALIGN=LEFT | N/A | ALIGN=LEFT | APC Smart UPS 1500 | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A |- | 3 | ALIGN=LEFT | Hcl01 (NIC1) | ALIGN=LEFT | Dell Poweredge 750 | ALIGN=LEFT | 192.168.21.3 | ALIGN=LEFT | 3.4 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | 250GB SATA | ALIGN=LEFT | 2 x Gigabit |- | 3 | ALIGN=LEFT | Hcl01 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.103 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |- | 4 | ALIGN=LEFT | Hcl02 (NIC1) | ALIGN=LEFT | Dell Poweredge 750 | ALIGN=LEFT | 192.168.21.4 | ALIGN=LEFT | 3.4 Xeon 
| ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | 250GB SATA | ALIGN=LEFT | 2 x Gigabit |- | 4 | ALIGN=LEFT | Hcl02 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.104 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |- | 5 | ALIGN=LEFT | Hcl03 (NIC1) | ALIGN=LEFT | Dell Poweredge 750 | ALIGN=LEFT | 192.168.21.5 | ALIGN=LEFT | 3.4 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit |- | 5 | ALIGN=LEFT | Hcl03 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.105 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |- | 6 | ALIGN=LEFT | Hcl04 (NIC1) | ALIGN=LEFT | Dell Poweredge 750 | ALIGN=LEFT | 192.168.21.6 | ALIGN=LEFT | 3.4 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit |- | 6 | ALIGN=LEFT | Hcl04 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.106 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |- | 7 | ALIGN=LEFT | Hcl05 (NIC1) | ALIGN=LEFT | Dell Poweredge SC1425 | ALIGN=LEFT | 192.168.21.7 | ALIGN=LEFT | 3.6 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 2MB | ALIGN=LEFT | 256MB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit |- | 7 | ALIGN=LEFT | Hcl05 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.107 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |- | 8 | ALIGN=LEFT | Hcl06 (NIC1) | ALIGN=LEFT | Dell Poweredge SC1425 | ALIGN=LEFT | 192.168.21.8 | ALIGN=LEFT | 3.6 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 2MB | ALIGN=LEFT | 256MB | ALIGN=LEFT | 
80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit |- | 8 | ALIGN=LEFT | Hcl06 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.108 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |- | 9 | ALIGN=LEFT | Hcl07 (NIC1) | ALIGN=LEFT | Dell Poweredge 750 | ALIGN=LEFT | 192.168.21.9 | ALIGN=LEFT | 3.4 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 256MB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit |- | 9 | ALIGN=LEFT | Hcl07 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.109 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |- | 10 | ALIGN=LEFT | Hcl08 (NIC1) | ALIGN=LEFT | Dell Poweredge 750 | ALIGN=LEFT | 192.168.21.10 | ALIGN=LEFT | 3.4 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 256MB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit |- | 10 | ALIGN=LEFT | Hcl08 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.110 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |- | 11 | ALIGN=LEFT | Hcl09 (NIC1) | ALIGN=LEFT | IBM E-server 326 | ALIGN=LEFT | 192.168.21.11 | ALIGN=LEFT | 1.8 AMD Opteron | ALIGN=LEFT | 1GHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit |- | 11 | ALIGN=LEFT | Hcl09 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.111 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |- | HEIGHT=22 ALIGN=RIGHT SDVAL="12" SDNUM="1033;" | 12 | ALIGN=LEFT | Hcl10 (NIC1) | ALIGN=LEFT | IBM E-server 326 | ALIGN=LEFT | 192.168.21.12 | ALIGN=LEFT | 1.8 AMD Opteron | ALIGN=LEFT | 1GHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | 
N/A | ALIGN=LEFT | 2 x Gigabit |- | HEIGHT=16 ALIGN=RIGHT SDVAL="12" SDNUM="1033;" | 12 | ALIGN=LEFT | Hcl10 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.112 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |- | 13 | ALIGN=LEFT | Hcl11 (NIC1) | ALIGN=LEFT | IBM X-Series 306 | ALIGN=LEFT | 192.168.21.13 | ALIGN=LEFT | 3.2 P4 | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 512MB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit |- | 13 | ALIGN=LEFT | Hcl11 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.113 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |- | 14 | ALIGN=LEFT | Hcl12 (NIC1) | ALIGN=LEFT | HP Proliant DL 320 G3 | ALIGN=LEFT | 192.168.21.14 | ALIGN=LEFT | 3.4 P4 | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 512MB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit |- | 14 | ALIGN=LEFT | Hcl12 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.114 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |- | 15 | ALIGN=LEFT | Hcl13 (NIC1) | ALIGN=LEFT | HP Proliant DL 320 G3 | ALIGN=LEFT | 192.168.21.15 | ALIGN=LEFT | 2.9 Celeron | ALIGN=LEFT | 533MHz | ALIGN=LEFT | 256KB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit |- | 15 | ALIGN=LEFT | Hcl13 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.115 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |- | 16 | ALIGN=LEFT | Hcl14 (NIC1) | ALIGN=LEFT | HP Proliant DL 140 G2 | ALIGN=LEFT | 192.168.21.16 | ALIGN=LEFT | 3.4 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 
2 x Gigabit |- | 16 | ALIGN=LEFT | Hcl14 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.116 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |- | 17 | ALIGN=LEFT | Hcl15 (NIC1) | ALIGN=LEFT | HP Proliant DL 140 G2 | ALIGN=LEFT | 192.168.21.17 | ALIGN=LEFT | 2.8 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit |- | 17 | ALIGN=LEFT | Hcl15 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.117 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |- | 18 | ALIGN=LEFT | Hcl16 (NIC1) | ALIGN=LEFT | HP Proliant DL 140 G2 | ALIGN=LEFT | 192.168.21.18 | ALIGN=LEFT | 3.6 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 2MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit |- | 18 | ALIGN=LEFT | Hcl16 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.118 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> |} 11cbdc18c199b5290b5a3ddea6a6c9a4c7a01ca4 Test table 0 51 311 2010-04-26T16:02:34Z Davepc 2 New page: {{Redirect|Times table|a table of departure and arrival times|Timetable}} In [[mathematics]], a '''multiplication table''' (sometimes, less formally, a '''times table''') is a [[mathematic... wikitext text/x-wiki {{Redirect|Times table|a table of departure and arrival times|Timetable}} In [[mathematics]], a '''multiplication table''' (sometimes, less formally, a '''times table''') is a [[mathematical table]] used to define a multiplication [[binary operation|operation]] for an algebraic system. 
The [[decimal]] multiplication table was traditionally taught as an essential part of elementary arithmetic around the world, as it lays the foundation for arithmetic operations with our base-ten numbers. Many educators believe it is necessary to memorize the table up to 9 × 9. In countries using the [[Imperial system]] of measurement, such as the United States, it is often considered helpful to memorize the table up to 12 × 12. {| class="wikitable" style="text-align:right;" ! × ! 1 || 2 || 3 || 4 || 5 || 6 || 7 || 8 || 9 || 10 || 11 || 12 || 13 || 14 || 15 || 16 || 17 || 18 || 19 || 20 |- ! 1 | 1 || 2 || 3 || 4 || 5 || 6 || 7 || 8 || 9 || 10 || 11 || 12 || 13 || 14 || 15 || 16 || 17 || 18 || 19 || 20 |- ! 2 | 2 || 4 ||6 || 8 || 10 || 12 || 14 || 16 || 18 || 20 || 22 || 24 || 26 || 28 || 30 || 32 || 34 || 36 || 38 || 40 |- ! 3 | 3 || 6 || 9 || 12 || 15 || 18 || 21 || 24 || 27 || 30 || 33 || 36 || 39 || 42 || 45 || 48 || 51 || 54 || 57 || 60 |- ! 4 | 4 || 8 || 12 || 16 || 20 || 24 || 28 || 32 || 36 || 40 || 44 || 48 || 52 || 56 || 60 || 64 || 68 || 72 || 76 || 80 |- ! 5 | 5 || 10 || 15 || 20 || 25 || 30 || 35 || 40 || 45 || 50 || 55 || 60 || 65 || 70 || 75 || 80 || 85 || 90 || 95 || 100 |- ! 6 | 6 || 12 || 18 || 24 || 30 || 36 || 42 || 48 || 54 || 60 || 66 || 72 || 78 || 84 || 90 || 96 || 102 || 108 || 114 || 120 |- ! 7 | 7 || 14 || 21 || 28 || 35 || 42 || 49 || 56 || 63 || 70 || 77 || 84 || 91 || 98 || 105 || 112 || 119 || 126 || 133 || 140 |- ! 8 | 8 || 16 || 24 || 32 || 40 || 48 || 56 || 64 || 72 || 80 || 88 || 96 || 104 || 112 || 120 || 128 || 136 || 144 || 152 || 160 |- ! 9 | 9 || 18 || 27 || 36 || 45 || 54 || 63 || 72 || 81 || 90 || 99 || 108 || 117 || 126 || 135 || 144 || 153 || 162 || 171 || 180 |- ! 10 | 10 || 20 || 30 || 40 || 50 || 60 || 70 || 80 || 90 || 100 || 110 || 120 || 130 || 140 || 150 || 160 || 170 || 180 || 190 || 200 |- ! 
11 | 11 || 22 || 33 || 44 || 55 || 66 || 77 || 88 || 99 || 110 || 121 || 132 || 143 || 154 || 165 || 176 || 187 || 198 || 209 || 220 |- ! 12 | 12 || 24 || 36 || 48 || 60 || 72 || 84 || 96 || 108 || 120 || 132 || 144 || 156 || 168 || 180 || 192 || 204 || 216 || 228 || 240 |- ! 13 | 13 || 26 || 39 || 52 || 65 || 78 || 91 || 104 || 117 || 130 || 143 || 156 || 169 || 182 || 195 || 208 || 221 || 234 || 247 || 260 |- ! 14 | 14 || 28 || 42 || 56 || 70 || 84 || 98 || 112 || 126 || 140 || 154 || 168 || 182 || 196 || 210 || 224 || 238 || 252 || 266 || 280 |- ! 15 | 15 || 30 || 45 || 60 || 75 || 90 || 105 ||120 || 135 || 150 || 165 || 180 || 195 || 210 || 225 || 240 || 255 || 270 || 285 || 300 |- ! 16 | 16 || 32 || 48 || 64 || 80 || 96 || 112 || 128 || 144 || 160 || 176 || 192 || 208 || 224 || 240 || 256 || 272 || 288 || 304 || 320 |- ! 17 | 17 || 34 || 51 || 68 || 85 || 102 || 119 || 136 || 153 || 170 || 187 || 204 || 221 || 238 || 255 || 272 || 289 || 306 || 323 || 340 |- ! 18 | 18 || 36 || 54 || 72 || 90 || 108 || 126 || 144 || 162 || 180 || 198 || 216 || 234 || 252 || 270 || 288 || 306 || 324 || 342 || 360 |- ! 19 | 19 || 38 || 57 || 76 || 95 || 114 || 133 || 152 || 171 || 190 || 209 || 228 || 247 || 266 || 285 || 304 || 323 || 342 || 361 || 380 |- ! 
20 | 20 || 40 || 60 || 80 || 100 || 120 || 140 || 160 || 180 || 200 || 220 || 240 || 260 || 280 || 300 || 320 || 340 || 360 || 380 || 400 |} ==Traditional use== In 493 A.D., [[Victorius of Aquitaine]] wrote a 98-column multiplication table which gave (in [[Roman numerals]]) the product of every number from 2 to 50 times and the rows were "a list of numbers starting with one thousand, descending by hundreds to one hundred, then descending by tens to ten, then by ones to one, and then the fractions down to 1/144" (Maher & Makowski 2001, p.383) The traditional rote learning of multiplication was based on memorization of columns in the table, in a form like <div style="-moz-column-count:2;"> 1 × 10 = 10 2 × 10 = 20 3 × 10 = 30 4 × 10 = 40 5 × 10 = 50 6 × 10 = 60 7 × 10 = 70 8 × 10 = 80 9 × 10 = 90 10 x 10 = 100 11 x 10 = 110 12 x 10 = 120 13 x 10 = 130 14 x 10 = 140 15 x 10 = 150 16 x 10 = 160 17 x 10 = 170 18 x 10 = 180 19 x 10 = 190 100 x 10 = 1000 </div> This form of writing the multiplication table in columns with complete number sentences is still used in some countries instead of the modern grid above. ==Patterns in the tables== There is a pattern in the multiplication table that can help people to memorize the table more easily. It uses the figures below: → → 1 2 3 2 4 ↑ 4 5 6 ↓ ↑ ↓ 7 8 9 6 8 ← ← 0 0 Fig. 1 Fig. 2 For example, to memorize all the multiples of 7: # Look at the 7 in the first picture and follow the arrow. # The next number in the direction of the arrow is 4. So think of the next number after 7 that ends with 4, which is 14. # The next number in the direction of the arrow is 1. So think of the next number after 14 that ends with 1, which is 21. # After coming to the top of this column, start with the bottom of the next column, and travel in the same direction. The number is 8. So think of the next number after 21 that ends with 8, which is 28. # Proceed in the same way until the last number, 3, which corresponds to 63. 
# Next, use the 0 at the bottom. It corresponds to 70. # Then, start again with the 7. This time it will correspond to 77. # Continue like this. Figure 1 is used for multiples of 1, 3, 7, and 9. Figure 2 is used for the multiples of 2, 4, 6, and 8. These patterns can be used to memorize the multiples of any number from 1 to 9, except 5. ==In abstract algebra== Multiplication tables can also define binary operations on [[group (mathematics)|group]]s, [[field (mathematics)|field]]s, [[ring (mathematics)|ring]]s, and other [[Abstract algebra|algebraic systems]]. In such contexts they can be called [[Cayley table]]s. For an example, see [[octonion]]. ==Standards-based mathematics reform in the USA== In 1989, the [[National Council of Teachers of Mathematics]] (NCTM) developed new standards which were based on the belief that all students should learn higher-order thinking skills, and which recommended reduced emphasis on the teaching of traditional methods that relied on rote memorization, such as multiplication tables. Widely adopted texts such as [[Investigations in Numbers, Data, and Space]] (widely known as [[TERC]] after its producer, Technical Education Research Centers) omitted aids such as multiplication tables in early editions. It is thought by many{{Who|date=November 2009}} that electronic calculators have made it unnecessary or counter-productive to invest time in memorizing the multiplication table. NCTM made it clear in their 2006 [[Principles and Standards for School Mathematics#Curriculum Focal Points|Focal Points]] that basic mathematics facts must be learned, though there is no consensus on whether rote memorization is the best method. 
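The trick works because the last digits of successive multiples of a number cycle through the fixed sequence that the figures encode. The cycle can be checked with a short script (illustrative only, not part of the original article; the function name is ours):

<source lang="bash">
# Print the last digits of the first ten multiples of n.
# For n = 7 this reproduces the path traced in Fig. 1: 7, 4, 1, 8, 5, 2, 9, 6, 3, 0.
last_digit_cycle() {
  n=$1
  out=""
  for i in 1 2 3 4 5 6 7 8 9 10; do
    out="$out$(( (n * i) % 10 ))"
  done
  echo "$out"
}
last_digit_cycle 7   # 7418529630
</source>

Running it for 5 shows why 5 is excluded: its last digits just alternate 5, 0, so there is no useful cycle to memorize.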
==See also== * [[Vedic square]] [[Category:Multiplication]] [[Category:Mathematics education]] [[az:Vurma cədvəli]] [[be:Табліца памнажэння]] [[be-x-old:Табліца множаньня]] [[de:Einmaleins]] [[el:Πίνακας πολλαπλασιασμού]] [[es:Tabla de multiplicar]] [[fr:Table de multiplication]] [[ko:구구법]] [[hi:पहाड़ा]] [[io:Tabulo multipliko]] [[it:Tavola pitagorica]] [[lt:Daugybos lentelė]] [[mk:Таблица множење]] [[nl:Tafels van vermenigvuldiging]] [[ja:九九]] [[no:Multiplikasjonstabell]] [[pl:Tabliczka mnożenia]] [[pt:Tabuada de multiplicar]] [[ru:Таблица умножения]] [[scn:Tàvula pitagorica]] [[si:ගුණ කිරීමේ වගුව]] [[fi:Kertotaulu]] [[sv:Multiplikationstabell]] [[th:สูตรคูณ]] [[tg:Ҷадвали Пифагор]] [[uk:Таблиця множення]] [[vi:Bản cửu chương]] [[zh:乘法表]] d3d7c118a29e076b26ffdaf2cdb959b57119af1d 313 311 2010-04-26T16:05:48Z Davepc 2 Removing all content from page wikitext text/x-wiki da39a3ee5e6b4b0d3255bfef95601890afd80709 HCL cluster/heterogeneous.ucd.ie install log 0 48 334 333 2010-04-27T11:56:27Z Rhiggins 4 /* Networking */ wikitext text/x-wiki * Basic installation of Debian Squeeze ==Networking== ===Interfaces=== * Edit <code>/etc/network/interfaces</code>. Note that at some point eth1 should be configured by DHCP; it is on the UCD LAN and must be registered correctly (update the MAC address with services). <code>eth0</code> is the internal network. <source lang="text"> # The loopback network interface auto lo eth0 eth1 iface lo inet loopback # The primary network interface allow-hotplug eth0 iface eth0 inet static address 192.168.21.254 netmask 255.255.255.0 gateway 192.168.21.1 iface eth1 inet static address 193.1.132.124 netmask 255.255.252.0 gateway 193.1.132.1 </source> * Install non-free Linux firmware for the network interface (eth0).
Edit <code>/etc/apt/sources.list</code> including the lines: <source lang="text">deb http://ftp.ie.debian.org/debian/ squeeze main contrib non-free deb-src http://ftp.ie.debian.org/debian/ squeeze main contrib non-free</source> * Install firmware-linux: <source lang="text">apt-get update && apt-get install firmware-linux</source> You probably need to reboot now. ===DNS / BIND=== We will run our own DNS server for the cluster. First set resolv.conf: <source lang="text"> nameserver 127.0.0.1 nameserver 137.43.116.19 nameserver 137.43.116.17 nameserver 137.43.105.22 domain ucd.ie search ucd.ie </source> Now install bind9, edit <code>/etc/bind/named.conf.local</code> and set the domain zones for the cluster (forward and reverse). <source lang="text"> // // Do any local configuration here // // Consider adding the 1918 zones here, if they are not used in your // organization //include "/etc/bind/zones.rfc1918"; include "/etc/bind/rndc.key"; controls { inet 127.0.0.1 allow { localhost; } keys { "rndc-key"; }; }; zone "heterogeneous.ucd.ie" { type master; file "db.heterogeneous.ucd.ie"; allow-update { key "rndc-key"; }; notify yes; }; zone "21.168.192.in-addr.arpa" { type master; file "db.192.168.21"; allow-update { key "rndc-key"; }; notify yes; }; </source> Now work on the zone file specified, <code>db.heterogeneous.ucd.ie</code>, and the reverse map, <code>db.192.168.21</code>. ===IP Tables=== * Set up <code>iptables</code>. We want to implement NAT between the internal network (<code>eth0</code>) and the external one (<code>eth1</code>). Add a script to the <code>/etc/network/if-up.d</code> directory. All scripts in this directory are executed after the network interfaces are brought up, so this will persist: <source lang="bash"> #!/bin/sh PATH=/usr/sbin:/sbin:/bin:/usr/bin IF_INT=eth0 IF_EXT=eth1 # # delete all existing rules.
# iptables -F iptables -t nat -F iptables -t mangle -F iptables -X # Always accept loopback traffic iptables -A INPUT -i lo -j ACCEPT # Allow established connections, and those not coming from the outside iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT iptables -A INPUT -m state --state NEW ! -i $IF_EXT -j ACCEPT iptables -A FORWARD -i $IF_EXT -o $IF_INT -m state --state ESTABLISHED,RELATED -j ACCEPT # Allow outgoing connections from the LAN side. iptables -A FORWARD -i $IF_INT -o $IF_EXT -j ACCEPT # Masquerade. iptables -t nat -A POSTROUTING -o $IF_EXT -j MASQUERADE # Don't forward from the outside to the inside. iptables -A FORWARD -i $IF_EXT -o $IF_INT -j REJECT # Enable routing. echo 1 > /proc/sys/net/ipv4/ip_forward </source> ==Clonezilla== * Follow the guide to installing Clonezilla [http://www.howtoforge.com/cloning-linux-systems-with-clonezilla-server-edition-clonezilla-se-p2 here]. Essentially: ** add the repository key: <source lang="text">wget -q http://drbl.sourceforge.net/GPG-KEY-DRBL -O- | apt-key add -</source> ** add the following line to /etc/apt/sources.list: <source lang="text">deb http://drbl.sourceforge.net/drbl-core drbl stable</source> ** run: <source lang="text">apt-get update && apt-get install drbl && /opt/drbl/sbin/drbl4imp</source> ** accept the default options for drbl4imp. * After Clonezilla has been installed, edit <code>/etc/dhcp3/dhcpd.conf</code>, adding entries for the test nodes <code>hcl07</code> and <code>hcl03</code>. Also ensure these nodes have been removed from the existing heterogeneous.ucd.ie server so that they are only served by one machine. <source lang="text"> default-lease-time 300; max-lease-time 300; option subnet-mask 255.255.255.0; option domain-name-servers 137.43.116.19,137.43.116.17,137.43.105.22; option domain-name "ucd.ie"; ddns-update-style none; # brett had ad-hoc ...?
server-name drbl; filename = "pxelinux.0"; subnet 192.168.21.0 netmask 255.255.255.0 { option subnet-mask 255.255.255.0; option routers 192.168.21.1; next-server 192.168.21.254; pool { # allow members of "DRBL-Client"; range 192.168.21.200 192.168.21.212; } host hcl03 { option host-name "hcl03.ucd.ie"; hardware ethernet 00:14:22:0A:22:6C; fixed-address 192.168.21.5; } host hcl03_eth1 { option host-name "hcl03_eth1.ucd.ie"; hardware ethernet 00:14:22:0A:22:6D; fixed-address 192.168.21.105; } host hcl07 { option host-name "hcl07.ucd.ie"; hardware ethernet 00:14:22:0A:20:E2; fixed-address 192.168.21.9; } host hcl07_eth1 { option host-name "hcl07_eth1.ucd.ie"; hardware ethernet 00:14:22:0A:20:E3; fixed-address 192.168.21.109; } default-lease-time 21600; max-lease-time 43200; } </source> ==Install NIS== Copy the users from <code>passwd</code>, <code>group</code> and <code>shadow</code> in <code>/etc</code> on <code>hcl01</code>. Install nis. Edit <code>/etc/defaultdomain</code> so that it contains: heterogeneous.ucd.ie Edit <code>/etc/default/nis</code> so that it contains: # Are we a NIS server and if so what kind (values: false, slave, master) NISSERVER=master Edit <code>/etc/ypserv.securenets</code> so that it contains: # allow connections from local 255.0.0.0 127.0.0.0 # allow connections from heterogeneous subnet 255.255.255.0 192.168.21.0 The NIS host is also a client of itself, so do the client set-up as follows: Edit <code>/etc/hosts</code> and ensure the NIS master is listed: 192.168.21.254 heterogeneous.ucd.ie heterogeneous Edit <code>/etc/yp.conf</code> and ensure that it contains: domain heterogeneous.ucd.ie server localhost Edit <code>/etc/passwd</code>, adding a line to the end that reads <code>+::::::</code>. Edit <code>/etc/group</code>, adding a line <code>+:::</code> at the end.
The NIS Makefile will not pull in user and group IDs lower than a certain value; set this to 500 in <code>/var/yp/Makefile</code>: MINUID=500 MINGID=500 Start the <code>ypbind</code> and <code>yppasswd</code> services. Then initialise the NIS database: /usr/lib/yp/ypinit -m Accept the defaults at the prompts. Now start the other NIS services: service nis start ==Installing Ganglia Frontend== Install the packages gmetad and ganglia-webfrontend. Configure the front end by appending the following to <code>/etc/apache2/apache2.conf</code>: Include /etc/ganglia-webfrontend/apache.conf Configure gmetad by adding the following line to <code>/etc/ganglia/gmetad.conf</code>: data_source "HCL Cluster" hcl07 After all packages are configured, execute: <source lang="text"> service apache2 restart service gmetad restart </source> Pointing your browser to [http://192.168.21.254/ganglia/index.php here] should display the monitoring page for the HCL Cluster. <code>gmond</code> must also be installed and configured on the cluster nodes. a32925153abef2e1ffa6f88c9b90607c50aafd9c 335 334 2010-04-27T11:57:43Z Rhiggins 4 /* Interfaces */ wikitext text/x-wiki * Basic installation of Debian Squeeze ==Networking== ===Interfaces=== * Edit <code>/etc/network/interfaces</code>. Note that at some point eth1 should be configured by DHCP; it is on the UCD LAN and must be registered correctly (update the MAC address with services). <code>eth0</code> is the internal network. <source lang="text"> # The loopback network interface auto lo eth0 eth1 iface lo inet loopback # The primary network interface allow-hotplug eth0 iface eth0 inet static address 192.168.21.254 netmask 255.255.255.0 gateway 192.168.21.1 iface eth1 inet static address 193.1.132.124 netmask 255.255.252.0 gateway 193.1.132.1 </source> * Install non-free Linux firmware for the network interface (eth0). This will allow Gigabit operation on eth0 with the tg3 hardware (I think).
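Once the firmware is in place, the negotiated link speed can be confirmed. A small sketch of the check (the interface name and sample text are illustrative; on the live machine the input would come from <code>ethtool eth0</code>):

<source lang="bash">
# Extract the negotiated speed from ethtool-style output.
# A canned sample is parsed here so the logic is visible without the
# hardware; on the server itself you would pipe in: ethtool eth0
sample='Settings for eth0:
	Supported ports: [ TP ]
	Speed: 1000Mb/s
	Duplex: Full'
link_speed() {
  printf '%s\n' "$1" | awk '/Speed:/ { print $2 }'
}
link_speed "$sample"   # 1000Mb/s when gigabit negotiation succeeded
</source>

If the reported speed stays at 100Mb/s after the reboot, the firmware package most likely did not load for the tg3 device.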
Edit <code>/etc/apt/sources.list</code> including the lines: <source lang="text">deb http://ftp.ie.debian.org/debian/ squeeze main contrib non-free deb-src http://ftp.ie.debian.org/debian/ squeeze main contrib non-free</source> * Install firmware-linux: <source lang="text">apt-get update && apt-get install firmware-linux</source> You probably need to reboot now. ===DNS / BIND=== We will run our own DNS server for the cluster. First set resolv.conf: <source lang="text"> nameserver 127.0.0.1 nameserver 137.43.116.19 nameserver 137.43.116.17 nameserver 137.43.105.22 domain ucd.ie search ucd.ie </source> Now install bind9, edit <code>/etc/bind/named.conf.local</code> and set the domain zones for the cluster (forward and reverse). <source lang="text"> // // Do any local configuration here // // Consider adding the 1918 zones here, if they are not used in your // organization //include "/etc/bind/zones.rfc1918"; include "/etc/bind/rndc.key"; controls { inet 127.0.0.1 allow { localhost; } keys { "rndc-key"; }; }; zone "heterogeneous.ucd.ie" { type master; file "db.heterogeneous.ucd.ie"; allow-update { key "rndc-key"; }; notify yes; }; zone "21.168.192.in-addr.arpa" { type master; file "db.192.168.21"; allow-update { key "rndc-key"; }; notify yes; }; </source> Now work on the zone file specified, <code>db.heterogeneous.ucd.ie</code>, and the reverse map, <code>db.192.168.21</code>. ===IP Tables=== * Set up <code>iptables</code>. We want to implement NAT between the internal network (<code>eth0</code>) and the external one (<code>eth1</code>). Add a script to the <code>/etc/network/if-up.d</code> directory. All scripts in this directory are executed after the network interfaces are brought up, so this will persist: <source lang="bash"> #!/bin/sh PATH=/usr/sbin:/sbin:/bin:/usr/bin IF_INT=eth0 IF_EXT=eth1 # # delete all existing rules.
# iptables -F iptables -t nat -F iptables -t mangle -F iptables -X # Always accept loopback traffic iptables -A INPUT -i lo -j ACCEPT # Allow established connections, and those not coming from the outside iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT iptables -A INPUT -m state --state NEW ! -i $IF_EXT -j ACCEPT iptables -A FORWARD -i $IF_EXT -o $IF_INT -m state --state ESTABLISHED,RELATED -j ACCEPT # Allow outgoing connections from the LAN side. iptables -A FORWARD -i $IF_INT -o $IF_EXT -j ACCEPT # Masquerade. iptables -t nat -A POSTROUTING -o $IF_EXT -j MASQUERADE # Don't forward from the outside to the inside. iptables -A FORWARD -i $IF_EXT -o $IF_INT -j REJECT # Enable routing. echo 1 > /proc/sys/net/ipv4/ip_forward </source> ==Clonezilla== * Follow the guide to installing Clonezilla [http://www.howtoforge.com/cloning-linux-systems-with-clonezilla-server-edition-clonezilla-se-p2 here]. Essentially: ** add the repository key: <source lang="text">wget -q http://drbl.sourceforge.net/GPG-KEY-DRBL -O- | apt-key add -</source> ** add the following line to /etc/apt/sources.list: <source lang="text">deb http://drbl.sourceforge.net/drbl-core drbl stable</source> ** run: <source lang="text">apt-get update && apt-get install drbl && /opt/drbl/sbin/drbl4imp</source> ** accept the default options for drbl4imp. * After Clonezilla has been installed, edit <code>/etc/dhcp3/dhcpd.conf</code>, adding entries for the test nodes <code>hcl07</code> and <code>hcl03</code>. Also ensure these nodes have been removed from the existing heterogeneous.ucd.ie server so that they are only served by one machine. <source lang="text"> default-lease-time 300; max-lease-time 300; option subnet-mask 255.255.255.0; option domain-name-servers 137.43.116.19,137.43.116.17,137.43.105.22; option domain-name "ucd.ie"; ddns-update-style none; # brett had ad-hoc ...?
server-name drbl; filename = "pxelinux.0"; subnet 192.168.21.0 netmask 255.255.255.0 { option subnet-mask 255.255.255.0; option routers 192.168.21.1; next-server 192.168.21.254; pool { # allow members of "DRBL-Client"; range 192.168.21.200 192.168.21.212; } host hcl03 { option host-name "hcl03.ucd.ie"; hardware ethernet 00:14:22:0A:22:6C; fixed-address 192.168.21.5; } host hcl03_eth1 { option host-name "hcl03_eth1.ucd.ie"; hardware ethernet 00:14:22:0A:22:6D; fixed-address 192.168.21.105; } host hcl07 { option host-name "hcl07.ucd.ie"; hardware ethernet 00:14:22:0A:20:E2; fixed-address 192.168.21.9; } host hcl07_eth1 { option host-name "hcl07_eth1.ucd.ie"; hardware ethernet 00:14:22:0A:20:E3; fixed-address 192.168.21.109; } default-lease-time 21600; max-lease-time 43200; } </source> ==Install NIS== Copy the users from <code>passwd</code>, <code>group</code> and <code>shadow</code> in <code>/etc</code> on <code>hcl01</code>. Install nis. Edit <code>/etc/defaultdomain</code> so that it contains: heterogeneous.ucd.ie Edit <code>/etc/default/nis</code> so that it contains: # Are we a NIS server and if so what kind (values: false, slave, master) NISSERVER=master Edit <code>/etc/ypserv.securenets</code> so that it contains: # allow connections from local 255.0.0.0 127.0.0.0 # allow connections from heterogeneous subnet 255.255.255.0 192.168.21.0 The NIS host is also a client of itself, so do the client set-up as follows: Edit <code>/etc/hosts</code> and ensure the NIS master is listed: 192.168.21.254 heterogeneous.ucd.ie heterogeneous Edit <code>/etc/yp.conf</code> and ensure that it contains: domain heterogeneous.ucd.ie server localhost Edit <code>/etc/passwd</code>, adding a line to the end that reads <code>+::::::</code>. Edit <code>/etc/group</code>, adding a line <code>+:::</code> at the end.
The NIS Makefile will not pull in user and group IDs lower than a certain value; set this to 500 in <code>/var/yp/Makefile</code>: MINUID=500 MINGID=500 Start the <code>ypbind</code> and <code>yppasswd</code> services. Then initialise the NIS database: /usr/lib/yp/ypinit -m Accept the defaults at the prompts. Now start the other NIS services: service nis start ==Installing Ganglia Frontend== Install the packages gmetad and ganglia-webfrontend. Configure the front end by appending the following to <code>/etc/apache2/apache2.conf</code>: Include /etc/ganglia-webfrontend/apache.conf Configure gmetad by adding the following line to <code>/etc/ganglia/gmetad.conf</code>: data_source "HCL Cluster" hcl07 After all packages are configured, execute: <source lang="text"> service apache2 restart service gmetad restart </source> Pointing your browser to [http://192.168.21.254/ganglia/index.php here] should display the monitoring page for the HCL Cluster. <code>gmond</code> must also be installed and configured on the cluster nodes. a44b1fa9858f31366315cc0c7c87b57dce9d0de6 336 335 2010-04-27T11:58:19Z Rhiggins 4 /* DNS / BIND */ wikitext text/x-wiki * Basic installation of Debian Squeeze ==Networking== ===Interfaces=== * Edit <code>/etc/network/interfaces</code>. Note that at some point eth1 should be configured by DHCP; it is on the UCD LAN and must be registered correctly (update the MAC address with services). <code>eth0</code> is the internal network. <source lang="text"> # The loopback network interface auto lo eth0 eth1 iface lo inet loopback # The primary network interface allow-hotplug eth0 iface eth0 inet static address 192.168.21.254 netmask 255.255.255.0 gateway 192.168.21.1 iface eth1 inet static address 193.1.132.124 netmask 255.255.252.0 gateway 193.1.132.1 </source> * Install non-free Linux firmware for the network interface (eth0). This will allow Gigabit operation on eth0 with the tg3 hardware (I think).
Edit <code>/etc/apt/sources.list</code> including the lines: <source lang="text">deb http://ftp.ie.debian.org/debian/ squeeze main contrib non-free deb-src http://ftp.ie.debian.org/debian/ squeeze main contrib non-free</source> * Install firmware-linux: <source lang="text">apt-get update && apt-get install firmware-linux</source> You probably need to reboot now. ===DNS / BIND=== We will run our own DNS server for the cluster. First set resolv.conf: <source lang="text"> nameserver 127.0.0.1 nameserver 137.43.116.19 nameserver 137.43.116.17 nameserver 137.43.105.22 domain ucd.ie search ucd.ie </source> Now install bind9, edit <code>/etc/bind/named.conf.local</code> and set the domain zones for the cluster (forward and reverse). <source lang="text"> // // Do any local configuration here // // Consider adding the 1918 zones here, if they are not used in your // organization //include "/etc/bind/zones.rfc1918"; include "/etc/bind/rndc.key"; controls { inet 127.0.0.1 allow { localhost; } keys { "rndc-key"; }; }; zone "heterogeneous.ucd.ie" { type master; file "db.heterogeneous.ucd.ie"; allow-update { key "rndc-key"; }; notify yes; }; zone "21.168.192.in-addr.arpa" { type master; file "db.192.168.21"; allow-update { key "rndc-key"; }; notify yes; }; </source> Now work on the zone file specified, <code>db.heterogeneous.ucd.ie</code>, and the reverse map, <code>db.192.168.21</code>. Populate them with all nodes of the cluster. ===IP Tables=== * Set up <code>iptables</code>. We want to implement NAT between the internal network (<code>eth0</code>) and the external one (<code>eth1</code>). Add a script to the <code>/etc/network/if-up.d</code> directory. All scripts in this directory are executed after the network interfaces are brought up, so this will persist: <source lang="bash"> #!/bin/sh PATH=/usr/sbin:/sbin:/bin:/usr/bin IF_INT=eth0 IF_EXT=eth1 # # delete all existing rules.
# iptables -F iptables -t nat -F iptables -t mangle -F iptables -X # Always accept loopback traffic iptables -A INPUT -i lo -j ACCEPT # Allow established connections, and those not coming from the outside iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT iptables -A INPUT -m state --state NEW ! -i $IF_EXT -j ACCEPT iptables -A FORWARD -i $IF_EXT -o $IF_INT -m state --state ESTABLISHED,RELATED -j ACCEPT # Allow outgoing connections from the LAN side. iptables -A FORWARD -i $IF_INT -o $IF_EXT -j ACCEPT # Masquerade. iptables -t nat -A POSTROUTING -o $IF_EXT -j MASQUERADE # Don't forward from the outside to the inside. iptables -A FORWARD -i $IF_EXT -o $IF_INT -j REJECT # Enable routing. echo 1 > /proc/sys/net/ipv4/ip_forward </source> ==Clonezilla== * Follow the guide to installing Clonezilla [http://www.howtoforge.com/cloning-linux-systems-with-clonezilla-server-edition-clonezilla-se-p2 here]. Essentially: ** add the repository key: <source lang="text">wget -q http://drbl.sourceforge.net/GPG-KEY-DRBL -O- | apt-key add -</source> ** add the following line to /etc/apt/sources.list: <source lang="text">deb http://drbl.sourceforge.net/drbl-core drbl stable</source> ** run: <source lang="text">apt-get update && apt-get install drbl && /opt/drbl/sbin/drbl4imp</source> ** accept the default options for drbl4imp. * After Clonezilla has been installed, edit <code>/etc/dhcp3/dhcpd.conf</code>, adding entries for the test nodes <code>hcl07</code> and <code>hcl03</code>. Also ensure these nodes have been removed from the existing heterogeneous.ucd.ie server so that they are only served by one machine. <source lang="text"> default-lease-time 300; max-lease-time 300; option subnet-mask 255.255.255.0; option domain-name-servers 137.43.116.19,137.43.116.17,137.43.105.22; option domain-name "ucd.ie"; ddns-update-style none; # brett had ad-hoc ...?
server-name drbl; filename = "pxelinux.0"; subnet 192.168.21.0 netmask 255.255.255.0 { option subnet-mask 255.255.255.0; option routers 192.168.21.1; next-server 192.168.21.254; pool { # allow members of "DRBL-Client"; range 192.168.21.200 192.168.21.212; } host hcl03 { option host-name "hcl03.ucd.ie"; hardware ethernet 00:14:22:0A:22:6C; fixed-address 192.168.21.5; } host hcl03_eth1 { option host-name "hcl03_eth1.ucd.ie"; hardware ethernet 00:14:22:0A:22:6D; fixed-address 192.168.21.105; } host hcl07 { option host-name "hcl07.ucd.ie"; hardware ethernet 00:14:22:0A:20:E2; fixed-address 192.168.21.9; } host hcl07_eth1 { option host-name "hcl07_eth1.ucd.ie"; hardware ethernet 00:14:22:0A:20:E3; fixed-address 192.168.21.109; } default-lease-time 21600; max-lease-time 43200; } </source> ==Install NIS== Copy the users from <code>passwd</code>, <code>group</code> and <code>shadow</code> in <code>/etc</code> on <code>hcl01</code>. Install nis. Edit <code>/etc/defaultdomain</code> so that it contains: heterogeneous.ucd.ie Edit <code>/etc/default/nis</code> so that it contains: # Are we a NIS server and if so what kind (values: false, slave, master) NISSERVER=master Edit <code>/etc/ypserv.securenets</code> so that it contains: # allow connections from local 255.0.0.0 127.0.0.0 # allow connections from heterogeneous subnet 255.255.255.0 192.168.21.0 The NIS host is also a client of itself, so do the client set-up as follows: Edit <code>/etc/hosts</code> and ensure the NIS master is listed: 192.168.21.254 heterogeneous.ucd.ie heterogeneous Edit <code>/etc/yp.conf</code> and ensure that it contains: domain heterogeneous.ucd.ie server localhost Edit <code>/etc/passwd</code>, adding a line to the end that reads <code>+::::::</code>. Edit <code>/etc/group</code>, adding a line <code>+:::</code> at the end.
The NIS Makefile will not pull in user and group IDs lower than a certain value; set this to 500 in <code>/var/yp/Makefile</code>: MINUID=500 MINGID=500 Start the <code>ypbind</code> and <code>yppasswd</code> services. Then initialise the NIS database: /usr/lib/yp/ypinit -m Accept the defaults at the prompts. Now start the other NIS services: service nis start ==Installing Ganglia Frontend== Install the packages gmetad and ganglia-webfrontend. Configure the front end by appending the following to <code>/etc/apache2/apache2.conf</code>: Include /etc/ganglia-webfrontend/apache.conf Configure gmetad by adding the following line to <code>/etc/ganglia/gmetad.conf</code>: data_source "HCL Cluster" hcl07 After all packages are configured, execute: <source lang="text"> service apache2 restart service gmetad restart </source> Pointing your browser to [http://192.168.21.254/ganglia/index.php here] should display the monitoring page for the HCL Cluster. <code>gmond</code> must also be installed and configured on the cluster nodes. 8ec799ec34d9b9c3e1e6a708c8458b18b9990569 338 336 2010-04-27T23:51:28Z Rhiggins 4 wikitext text/x-wiki * Basic installation of Debian Squeeze ==Networking== ===Interfaces=== * Edit <code>/etc/network/interfaces</code>. Note that at some point eth1 should be configured by DHCP; it is on the UCD LAN and must be registered correctly (update the MAC address with services). <code>eth0</code> is the internal network. <source lang="text"> # The loopback network interface auto lo eth0 eth1 iface lo inet loopback # The primary network interface allow-hotplug eth0 iface eth0 inet static address 192.168.21.254 netmask 255.255.255.0 gateway 192.168.21.1 iface eth1 inet static address 193.1.132.124 netmask 255.255.252.0 gateway 193.1.132.1 </source> * Install non-free Linux firmware for the network interface (eth0). This will allow Gigabit operation on eth0 with the tg3 hardware (I think).
Edit <code>/etc/apt/sources.list</code> including the lines: <source lang="text">deb http://ftp.ie.debian.org/debian/ squeeze main contrib non-free deb-src http://ftp.ie.debian.org/debian/ squeeze main contrib non-free</source> * Install firmware-linux: <source lang="text">apt-get update && apt-get install firmware-linux</source> You probably need to reboot now. ===DNS / BIND=== We will run our own DNS server for the cluster. First set resolv.conf: <source lang="text"> nameserver 127.0.0.1 nameserver 137.43.116.19 nameserver 137.43.116.17 nameserver 137.43.105.22 domain ucd.ie search ucd.ie </source> Now install bind9, edit <code>/etc/bind/named.conf.local</code> and set the domain zones for the cluster (forward and reverse). <source lang="text"> // // Do any local configuration here // // Consider adding the 1918 zones here, if they are not used in your // organization //include "/etc/bind/zones.rfc1918"; include "/etc/bind/rndc.key"; controls { inet 127.0.0.1 allow { localhost; } keys { "rndc-key"; }; }; zone "heterogeneous.ucd.ie" { type master; file "db.heterogeneous.ucd.ie"; allow-update { key "rndc-key"; }; notify yes; }; zone "21.168.192.in-addr.arpa" { type master; file "db.192.168.21"; allow-update { key "rndc-key"; }; notify yes; }; </source> Now work on the zone file specified, <code>db.heterogeneous.ucd.ie</code>, and the reverse map, <code>db.192.168.21</code>. Populate them with all nodes of the cluster. ===IP Tables=== * Set up <code>iptables</code>. We want to implement NAT between the internal network (<code>eth0</code>) and the external one (<code>eth1</code>). Add a script to the <code>/etc/network/if-up.d</code> directory. All scripts in this directory are executed after the network interfaces are brought up, so this will persist: <source lang="bash"> #!/bin/sh PATH=/usr/sbin:/sbin:/bin:/usr/bin IF_INT=eth0 IF_EXT=eth1 # # delete all existing rules.
# iptables -F iptables -t nat -F iptables -t mangle -F iptables -X # Always accept loopback traffic iptables -A INPUT -i lo -j ACCEPT # Allow established connections, and those not coming from the outside iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT iptables -A INPUT -m state --state NEW ! -i $IF_EXT -j ACCEPT iptables -A FORWARD -i $IF_EXT -o $IF_INT -m state --state ESTABLISHED,RELATED -j ACCEPT # Allow outgoing connections from the LAN side. iptables -A FORWARD -i $IF_INT -o $IF_EXT -j ACCEPT # Masquerade. iptables -t nat -A POSTROUTING -o $IF_EXT -j MASQUERADE # Don't forward from the outside to the inside. iptables -A FORWARD -i $IF_EXT -o $IF_INT -j REJECT # Enable routing. echo 1 > /proc/sys/net/ipv4/ip_forward </source> ==Clonezilla== * Follow the guide to installing Clonezilla [http://www.howtoforge.com/cloning-linux-systems-with-clonezilla-server-edition-clonezilla-se-p2 here]. Essentially: ** add the repository key: <source lang="text">wget -q http://drbl.sourceforge.net/GPG-KEY-DRBL -O- | apt-key add -</source> ** add the following line to /etc/apt/sources.list: <source lang="text">deb http://drbl.sourceforge.net/drbl-core drbl stable</source> ** run: <source lang="text">apt-get update && apt-get install drbl && /opt/drbl/sbin/drbl4imp</source> ** accept the default options for drbl4imp. * After Clonezilla has been installed, edit <code>/etc/dhcp3/dhcpd.conf</code>, adding entries for the test nodes <code>hcl07</code> and <code>hcl03</code>. Also ensure these nodes have been removed from the existing heterogeneous.ucd.ie server so that they are only served by one machine. <source lang="text"> default-lease-time 300; max-lease-time 300; option subnet-mask 255.255.255.0; option domain-name-servers 137.43.116.19,137.43.116.17,137.43.105.22; option domain-name "ucd.ie"; ddns-update-style none; # brett had ad-hoc ...?
server-name drbl; filename = "pxelinux.0"; subnet 192.168.21.0 netmask 255.255.255.0 { option subnet-mask 255.255.255.0; option routers 192.168.21.1; next-server 192.168.21.254; pool { # allow members of "DRBL-Client"; range 192.168.21.200 192.168.21.212; } host hcl03 { option host-name "hcl03.ucd.ie"; hardware ethernet 00:14:22:0A:22:6C; fixed-address 192.168.21.5; } host hcl03_eth1 { option host-name "hcl03_eth1.ucd.ie"; hardware ethernet 00:14:22:0A:22:6D; fixed-address 192.168.21.105; } host hcl07 { option host-name "hcl07.ucd.ie"; hardware ethernet 00:14:22:0A:20:E2; fixed-address 192.168.21.9; } host hcl07_eth1 { option host-name "hcl07_eth1.ucd.ie"; hardware ethernet 00:14:22:0A:20:E3; fixed-address 192.168.21.109; } default-lease-time 21600; max-lease-time 43200; } </source> ==Install NIS== Copy the users from <code>passwd</code>, <code>group</code> and <code>shadow</code> in <code>/etc</code> on <code>hcl01</code>. Install nis. Edit <code>/etc/defaultdomain</code> so that it contains: heterogeneous.ucd.ie Edit <code>/etc/default/nis</code> so that it contains: # Are we a NIS server and if so what kind (values: false, slave, master) NISSERVER=master Edit <code>/etc/ypserv.securenets</code> so that it contains: # allow connections from local 255.0.0.0 127.0.0.0 # allow connections from heterogeneous subnet 255.255.255.0 192.168.21.0 The NIS host is also a client of itself, so do the client set-up as follows: Edit <code>/etc/hosts</code> and ensure the NIS master is listed: 192.168.21.254 heterogeneous.ucd.ie heterogeneous Edit <code>/etc/yp.conf</code> and ensure that it contains: domain heterogeneous.ucd.ie server localhost Edit <code>/etc/passwd</code>, adding a line to the end that reads <code>+::::::</code>. Edit <code>/etc/group</code>, adding a line <code>+:::</code> at the end.
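Before initialising the maps it is worth confirming that the compat entries were appended correctly. A minimal check (the helper name and scratch file are ours; point the function at the real <code>/etc/passwd</code> on the server):

<source lang="bash">
# Return success if the last line of a passwd-style file is the
# NIS compat marker "+::::::".
has_nis_marker() {
  tail -n 1 "$1" | grep -qx '+::::::'
}
# Exercised against a scratch copy rather than the real /etc/passwd:
printf 'root:x:0:0:root:/root:/bin/bash\n+::::::\n' > /tmp/passwd.sample
has_nis_marker /tmp/passwd.sample && echo "marker present"
</source>

The same idea applies to <code>/etc/group</code> with the <code>+:::</code> line; after <code>ypbind</code> is running, <code>ypcat passwd</code> on a client is the end-to-end test.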
The NIS Makefile will not pull in user and group IDs lower than a certain value; set this to 500 in <code>/var/yp/Makefile</code>: MINUID=500 MINGID=500 Start the <code>ypbind</code> and <code>yppasswd</code> services. Then initialise the NIS database: /usr/lib/yp/ypinit -m Accept the defaults at the prompts. Now start the other NIS services: service nis start ==Installing Ganglia Frontend== Install the packages gmetad and ganglia-webfrontend. Configure the front end by appending the following to <code>/etc/apache2/apache2.conf</code>: Include /etc/ganglia-webfrontend/apache.conf Configure gmetad by adding the following line to <code>/etc/ganglia/gmetad.conf</code>: data_source "HCL Cluster" hcl07 After all packages are configured, execute: <source lang="text"> service apache2 restart service gmetad restart </source> Pointing your browser to [http://192.168.21.254/ganglia/index.php here] should display the monitoring page for the HCL Cluster. <code>gmond</code> must also be installed and configured on the cluster nodes. =Hardware Monitoring & Backup= ==Disk Monitoring== Install smartmontools as per [http://www.howtoforge.com/checking-hard-disk-sanity-with-smartmontools-debian-ubuntu here]. Briefly: apt-get install smartmontools Edit <code>/etc/default/smartmontools</code> so that it contains: <source lang="text"> # List of devices you want to explicitly enable S.M.A.R.T. for # Not needed (and not recommended) if the device is monitored by smartd enable_smart="/dev/sda" # uncomment to start smartd on system startup start_smartd=yes # uncomment to pass additional options to smartd on startup smartd_opts="--interval=1800" </source> Edit <code>/etc/smartd.conf</code> and edit the first line that begins with DEVICESCAN (all other lines are ignored).
Have it read something like:
 DEVICESCAN -d removable -n standby -m root -m robert_higgins@iol.ie -M exec /usr/share/smartmontools/smartd-runner
Then start the service: <code>/etc/init.d/smartmontools start</code>. Note: consider installing this on all nodes, as it would be useful to have prior notice of any failing disks.

94f405330bfae69e0191a53d18a15f5ef4892099 353 338 2010-04-29T15:51:07Z Rhiggins 4 /* DNS / BIND */ wikitext text/x-wiki

* Basic installation of Debian Squeeze

==Networking==
===Interfaces===
* Edit <code>/etc/network/interfaces</code>. Note that at some point eth1 should be configured by DHCP; it is on the UCD LAN and must be registered correctly (update the MAC address with Services). <code>eth0</code> is the internal network.
<source lang="text">
# The loopback network interface
auto lo eth0 eth1
iface lo inet loopback

# The primary network interface
allow-hotplug eth0
iface eth0 inet static
    address 192.168.21.254
    netmask 255.255.255.0
    gateway 192.168.21.1

iface eth1 inet static
    address 193.1.132.124
    netmask 255.255.252.0
    gateway 193.1.132.1
</source>
* Install the non-free Linux firmware for the network interface (eth0). This will allow Gigabit operation on eth0 with the tg3 hardware (I think). Edit <code>/etc/apt/sources.list</code> to include the lines:
<source lang="text">deb http://ftp.ie.debian.org/debian/ squeeze main contrib non-free
deb-src http://ftp.ie.debian.org/debian/ squeeze main contrib non-free</source>
* Install firmware-linux: <source lang="text">apt-get update && apt-get install firmware-linux</source> You probably need to reboot now.

===DNS / BIND===
We will run our own DNS server for the cluster. First set <code>/etc/resolv.conf</code>:
<source lang="text">
nameserver 127.0.0.1
nameserver 137.43.116.19
nameserver 137.43.116.17
nameserver 137.43.105.22
domain ucd.ie
search ucd.ie
</source>
Now install bind9 (<code>apt-get install bind9</code>). Edit <code>/etc/bind/named.conf.local</code> and set the domain zones for the cluster (forward and reverse).
We have two subnets, 192.168.20.0/24 and 192.168.21.0/24, for which reverse lookup zones must be specified.
<source lang="text">
//
// Do any local configuration here
//
// Consider adding the 1918 zones here, if they are not used in your
// organization
//include "/etc/bind/zones.rfc1918";

include "/etc/bind/rndc.key";
controls {
    inet 127.0.0.1 allow { localhost; } keys { "rndc-key"; };
};

zone "heterogeneous.ucd.ie" {
    type master;
    file "db.heterogeneous.ucd.ie";
};
zone "21.168.192.in-addr.arpa" {
    type master;
    file "db.192.168.21";
};
zone "20.168.192.in-addr.arpa" {
    type master;
    file "db.192.168.20";
};
</source>
Also edit the options file, <code>/etc/bind/named.conf.options</code>. Note the subnet defined in the allow sections, 192.168.20.0/23: it permits access from both 192.168.20.* and 192.168.21.* addresses.
<source lang="text">
options {
    directory "/var/cache/bind";

    // If there is a firewall between you and nameservers you want
    // to talk to, you may need to fix the firewall to allow multiple
    // ports to talk. See http://www.kb.cert.org/vuls/id/800113

    // If your ISP provided one or more IP addresses for stable
    // nameservers, you probably want to use them as forwarders.
// Uncomment the following block, and insert the addresses replacing // the all-0's placeholder. forwarders { 137.43.116.19; 137.43.116.17; 137.43.105.22; }; recursion yes; version "REFUSED"; allow-recursion { 127.0.0.1; 192.168.20.0/23; }; allow-query { 127.0.0.1; 192.168.20.0/23; }; auth-nxdomain no; # conform to RFC1035 listen-on-v6 { any; }; }; </source> Now work on the zone files specified <code>db.heterogneneous.ucd.ie</code> and the reverse maps <code>db.192.168.21</code> & <code>db.192.168.21</code> Populate them with all nodes of the cluster. ===IP Tables=== * Set up <code>iptables</code>. We want to implement NAT between the internal network (<code>eth0</code>) and external one (<code>eth1</code>). Add a script to <code>/etc/network/if-up.d</code> directory, named <code>00iptables</code>. All scripts in this directory will be executed after network interfaces are brought up, so this will persist: <source lang="bash"> #!/bin/sh PATH=/usr/sbin:/sbin:/bin:/usr/bin IF_INT=eth0 IF_EXT=eth1 # # delete all existing rules. # iptables -F iptables -t nat -F iptables -t mangle -F iptables -X # Always accept loopback traffic iptables -A INPUT -i lo -j ACCEPT # Allow established connections, and those not coming from the outside iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT iptables -A INPUT -m state --state NEW ! -i $IF_EXT -j ACCEPT iptables -A FORWARD -i $IF_EXT -o $IF_INT -m state --state ESTABLISHED,RELATED -j ACCEPT # Allow outgoing connections from the LAN side. iptables -A FORWARD -i $IF_INT -o $IF_EXT -j ACCEPT # Masquerade. iptables -t nat -A POSTROUTING -o $IF_EXT -j MASQUERADE # Don't forward from the outside to the inside. iptables -A FORWARD -i $IF_EXT -o $IF_EXT -j REJECT # Enable routing. echo 1 > /proc/sys/net/ipv4/ip_forward </source> ==Clonezilla== * follow the guide to installing Clonezilla [http://www.howtoforge.com/cloning-linux-systems-with-clonezilla-server-edition-clonezilla-se-p2 here]. 
Essentially: ** add repository key <source lang="text">wget -q http://drbl.sourceforge.net/GPG-KEY-DRBL -O- | apt-key add -</source> ** the line add to /etc/apt/sources.list: <source lang="text">deb http://drbl.sourceforge.net/drbl-core drbl stable</source> ** run: <source lang="text">apt-get update && apt-get install drbl && /opt/drbl/sbin/drbl4imp</source> ** accept default options to drbl4imp. * After Clonezilla has installed edit <code>/etc/dhcpd3/dhcpd.conf</code>, adding all entries for test nodes <code>hcl07</code> and <code>hcl03</code>. Also ensure these nodes have been removed from the inplace heterogeneous.ucd.ie server so that they are only served by one machine. <source lang="text"> default-lease-time 300; max-lease-time 300; option subnet-mask 255.255.255.0; option domain-name-servers 137.43.116.19,137.43.116.17,137.43.105.22; option domain-name "ucd.ie"; ddns-update-style none; # brett had ad-hoc ...? server-name drbl; filename = "pxelinux.0"; subnet 192.168.21.0 netmask 255.255.255.0 { option subnet-mask 255.255.255.0; option routers 192.168.21.1; next-server 192.168.21.254; pool { # allow members of "DRBL-Client"; range 192.168.21.200 192.168.21.212; } host hcl03 { option host-name "hcl03.ucd.ie"; hardware ethernet 00:14:22:0A:22:6C; fixed-address 192.168.21.5; } host hcl03_eth1 { option host-name "hcl03_eth1.ucd.ie"; hardware ethernet 00:14:22:0A:22:6D; fixed-address 192.168.21.105; } host hcl07 { option host-name "hcl07.ucd.ie"; hardware ethernet 00:14:22:0A:20:E2; fixed-address 192.168.21.9; } host hcl07_eth1 { option host-name "hcl07_eth1.ucd.ie"; hardware ethernet 00:14:22:0A:20:E3; fixed-address 192.168.21.109; } default-lease-time 21600; max-lease-time 43200; } </source> ==Install NIS== Copy users from <code>passwd</code>, <code>groups</code> and <code>shadow</code> from <code>/etc</code> on <code>hcl01</code>. Install nis. 
Edit <code>/etc/defaultdomain</code> so that it contains: heterogeneous.ucd.ie Edit <code>/etc/defaults/nis</code> so that it contains: # Are we a NIS server and if so what kind (values: false, slave, master) NISSERVER=master Edit <code>/etc/ypserv.securenets</code> so that is contains: # allow connects from local 255.0.0.0 127.0.0.0 # allow connections from heterogeneous subnets .20 and .21 255.255.254.0 192.168.20.0 The NIS host is also a client of itself, so do the client set up as follows: Edit <code>/etc/hosts</code> end ensure the NIS Master is listed 192.168.21.254 heterogeneous.ucd.ie heterogeneous Edit <code>/etc/yp.conf</code> and ensure that it contains: domain heterogeneous.ucd.ie server localhost Edit <code>/etc/passwd</code> adding a line to the end that reads: <code>+::::::</code>. Edit <code>/etc/group</code> with a line <code>+:::</code> at the line. The NIS Makefile will not pull userid and groupids that are lower than a certain value, we must set this to 500 in <code>/var/yp/Makefile</code> MINUID=500 MINGID=500 Start the <code>ypbind</code> and <code>yppasswd</code> services. Then initialise the NIS database: /usr/lib/yp/ypinit -m Accept defaults at prompts. Now start other NIS services service nis start ==Installing Ganglia Frontend== Install the packages gmetad and ganglia-webfrontend. Configure the front end by appending to <code>/etc/apache2/apache2.conf</code>, the following: Include /etc/ganglia-webfrontend/apache.conf Configure gmetad by adding to the <code>/etc/ganglia/gmetad.conf</code>, the following line: data_source "HCL Cluster" 192.168.20.1 192.168.20.16 This means that the gmetad collector connect to hcl01 and hcl16 on the .20 subnet to gather data for the frontend to use. After all packages are configured execute: <source lang="text"> service apache2 restart service gmetad restart </source> Pointing your browser to [http://heterogeneous.ucd.ie/ganglia/index.php here] should display the monitoring page for HCL Cluster. 
<code>gmond</code> must also be installed and configured on the cluster nodes. =Hardware Monitoring & Backup= ==Disk Monitoring== Install smartmontools as per [http://www.howtoforge.com/checking-hard-disk-sanity-with-smartmontools-debian-ubuntu here]. Briefly: apt-get install smartmontools Edit <code>/etc/defaults/smartmontools</code> so that it contails: <source lang="text"> # List of devices you want to explicitly enable S.M.A.R.T. for # Not needed (and not recommended) if the device is monitored by smartd enable_smart="/dev/sda" # uncomment to start smartd on system startup start_smartd=yes # uncomment to pass additional options to smartd on startup smartd_opts="--interval=1800" </source> Edit <code>/etc/smartd.conf</code> and edit the first line that begins with DEVICESCAN (all other lines are ignored). Have it read something like: DEVICESCAN -d removable -n standby -m root -m robert_higgins@iol.ie -M exec /usr/share/smartmontools/smartd-runner Then start the service <code>/etc/init.d/smartmontools start</code> Note, consider installing this on all nodes, as it would be interesting to have prior notice of any failing disks. 80193f448683454b1a4e817dc8ed14c6ed138175 357 356 2010-04-29T16:02:18Z Rhiggins 4 wikitext text/x-wiki * Basic installation of Debian Squeeze ==Networking== ===Interfaces=== * edit <code>/etc/networks/interfaces</code> Note that at some point eth1 should be configured by DHCP, it is on the UCD LAN and must be registered correctly (update MAC address with services). <code>eth0</code> is the internal network. <source lang="text"> # The loopback network interface auto lo eth0 eth1 iface lo inet loopback # The primary network interface allow-hotplug eth0 iface eth0 inet static address 192.168.21.254 netmask 255.255.255.0 gateway 192.168.21.1 iface eth1 inet static address 193.1.132.124 netmask 255.255.252.0 gateway 193.1.132.1 </source> * Install non-free linux firmware for network interface (eth0). 
This will allow Gigabit operation on eth0 with the tg3 hardware (I think). Edit <code>/etc/apt/sources.list</code> including the lines: <source lang="text">deb http://ftp.ie.debian.org/debian/ squeeze main contrib non-free deb-src http://ftp.ie.debian.org/debian/ squeeze main contrib non-free</source> * Install firmware-linux: <source lang="text">apt-get update && apt-get install firmware-linux</source> You probably need to reboot now. ===DNS / BIND=== We will run our own DNS server for the cluster. First set resolv.conf: <source lang="text"> nameserver 127.0.0.1 nameserver 137.43.116.19 nameserver 137.43.116.17 nameserver 137.43.105.22 domain ucd.ie search ucd.ie </source> Now install bind9 (<code>apt-get install bind9</code>). Edit <code>/etc/bind/named.conf.local</code> and set the domain zones for the cluster (forwards and reverse). We have two subdomains where reverse lookups will have to be specified 192.168.20 and 192.168.21 <source lang="text"> // // Do any local configuration here // // Consider adding the 1918 zones here, if they are not used in your // organization //include "/etc/bind/zones.rfc1918"; include "/etc/bind/rndc.key"; controls { inet 127.0.0.1 allow { localhost; } keys { "rndc-key"; }; }; zone "heterogeneous.ucd.ie" { type master; file "db.heterogeneous.ucd.ie"; }; zone "21.168.192.in-addr.arpa" { type master; file "db.192.168.21"; }; zone "20.168.192.in-addr.arpa" { type master; file "db.192.168.20"; }; </source> Also edit the options file: <code>/etc/bind/named.conf.options</code>, note the subnet we define in the allow sections, 192.168.20/23, it will permit access from 192.168.20.* and 192.168.21.* addresses. <source lang="text"> options { directory "/var/cache/bind"; // If there is a firewall between you and nameservers you want // to talk to, you may need to fix the firewall to allow multiple // ports to talk. 
See http://www.kb.cert.org/vuls/id/800113 // If your ISP provided one or more IP addresses for stable // nameservers, you probably want to use them as forwarders. // Uncomment the following block, and insert the addresses replacing // the all-0's placeholder. forwarders { 137.43.116.19; 137.43.116.17; 137.43.105.22; }; recursion yes; version "REFUSED"; allow-recursion { 127.0.0.1; 192.168.20.0/23; }; allow-query { 127.0.0.1; 192.168.20.0/23; }; auth-nxdomain no; # conform to RFC1035 listen-on-v6 { any; }; }; </source> Now work on the zone files specified <code>db.heterogneneous.ucd.ie</code> and the reverse maps <code>db.192.168.21</code> & <code>db.192.168.21</code> Populate them with all nodes of the cluster. ===IP Tables=== * Set up <code>iptables</code>. We want to implement NAT between the internal network (<code>eth0</code>) and external one (<code>eth1</code>). Add a script to <code>/etc/network/if-up.d</code> directory, named <code>00iptables</code>. All scripts in this directory will be executed after network interfaces are brought up, so this will persist: <source lang="bash"> #!/bin/sh PATH=/usr/sbin:/sbin:/bin:/usr/bin IF_INT=eth0 IF_EXT=eth1 # # delete all existing rules. # iptables -F iptables -t nat -F iptables -t mangle -F iptables -X # Always accept loopback traffic iptables -A INPUT -i lo -j ACCEPT # Allow established connections, and those not coming from the outside iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT iptables -A INPUT -m state --state NEW ! -i $IF_EXT -j ACCEPT iptables -A FORWARD -i $IF_EXT -o $IF_INT -m state --state ESTABLISHED,RELATED -j ACCEPT # Allow outgoing connections from the LAN side. iptables -A FORWARD -i $IF_INT -o $IF_EXT -j ACCEPT # Masquerade. iptables -t nat -A POSTROUTING -o $IF_EXT -j MASQUERADE # Don't forward from the outside to the inside. iptables -A FORWARD -i $IF_EXT -o $IF_EXT -j REJECT # Enable routing. 
echo 1 > /proc/sys/net/ipv4/ip_forward </source> ==Clonezilla== Firstly, Clonezilla is probably going to pollute a lot of your server configuration when it sets itself up. Be prepared to loose your IPtables configuration, NFS (if any) and DHCP settings. Maybe more. * follow the guide to installing Clonezilla [http://www.howtoforge.com/cloning-linux-systems-with-clonezilla-server-edition-clonezilla-se-p2 here]. Essentially: ** add repository key <source lang="text">wget -q http://drbl.sourceforge.net/GPG-KEY-DRBL -O- | apt-key add -</source> ** the line add to /etc/apt/sources.list: <source lang="text">deb http://drbl.sourceforge.net/drbl-core drbl stable</source> ** run: <source lang="text">apt-get update && apt-get install drbl && /opt/drbl/sbin/drbl4imp</source> ** accept default options to drbl4imp. * After Clonezilla has installed edit <code>/etc/dhcpd3/dhcpd.conf</code>, adding all entries for test nodes <code>hcl07</code> and <code>hcl03</code>. Also ensure these nodes have been removed from the inplace heterogeneous.ucd.ie server so that they are only served by one machine. <source lang="text"> default-lease-time 300; max-lease-time 300; option subnet-mask 255.255.255.0; option domain-name-servers 137.43.116.19,137.43.116.17,137.43.105.22; option domain-name "ucd.ie"; ddns-update-style none; # brett had ad-hoc ...? 
server-name drbl; filename = "pxelinux.0"; subnet 192.168.21.0 netmask 255.255.255.0 { option subnet-mask 255.255.255.0; option routers 192.168.21.1; next-server 192.168.21.254; pool { # allow members of "DRBL-Client"; range 192.168.21.200 192.168.21.212; } host hcl03 { option host-name "hcl03.ucd.ie"; hardware ethernet 00:14:22:0A:22:6C; fixed-address 192.168.21.5; } host hcl03_eth1 { option host-name "hcl03_eth1.ucd.ie"; hardware ethernet 00:14:22:0A:22:6D; fixed-address 192.168.21.105; } host hcl07 { option host-name "hcl07.ucd.ie"; hardware ethernet 00:14:22:0A:20:E2; fixed-address 192.168.21.9; } host hcl07_eth1 { option host-name "hcl07_eth1.ucd.ie"; hardware ethernet 00:14:22:0A:20:E3; fixed-address 192.168.21.109; } default-lease-time 21600; max-lease-time 43200; } </source> ==Install DHCP== Install the DHCP server package with <code>apt-get install dhcp3-server</code>. When you install Clonezilla it will probably pollute your DHCP server setup, so make ==Install NIS== Copy users from <code>passwd</code>, <code>groups</code> and <code>shadow</code> from <code>/etc</code> on <code>hcl01</code>. Install nis. Edit <code>/etc/defaultdomain</code> so that it contains: heterogeneous.ucd.ie Edit <code>/etc/defaults/nis</code> so that it contains: # Are we a NIS server and if so what kind (values: false, slave, master) NISSERVER=master Edit <code>/etc/ypserv.securenets</code> so that is contains: # allow connects from local 255.0.0.0 127.0.0.0 # allow connections from heterogeneous subnets .20 and .21 255.255.254.0 192.168.20.0 The NIS host is also a client of itself, so do the client set up as follows: Edit <code>/etc/hosts</code> end ensure the NIS Master is listed 192.168.21.254 heterogeneous.ucd.ie heterogeneous Edit <code>/etc/yp.conf</code> and ensure that it contains: domain heterogeneous.ucd.ie server localhost Edit <code>/etc/passwd</code> adding a line to the end that reads: <code>+::::::</code>. 
Edit <code>/etc/group</code> with a line <code>+:::</code> at the line. The NIS Makefile will not pull userid and groupids that are lower than a certain value, we must set this to 500 in <code>/var/yp/Makefile</code> MINUID=500 MINGID=500 Start the <code>ypbind</code> and <code>yppasswd</code> services. Then initialise the NIS database: /usr/lib/yp/ypinit -m Accept defaults at prompts. Now start other NIS services service nis start ==Installing Ganglia Frontend== Install the packages gmetad and ganglia-webfrontend. Configure the front end by appending to <code>/etc/apache2/apache2.conf</code>, the following: Include /etc/ganglia-webfrontend/apache.conf Configure gmetad by adding to the <code>/etc/ganglia/gmetad.conf</code>, the following line: data_source "HCL Cluster" 192.168.20.1 192.168.20.16 This means that the gmetad collector connect to hcl01 and hcl16 on the .20 subnet to gather data for the frontend to use. After all packages are configured execute: <source lang="text"> service apache2 restart service gmetad restart </source> Pointing your browser to [http://heterogeneous.ucd.ie/ganglia/index.php here] should display the monitoring page for HCL Cluster. <code>gmond</code> must also be installed and configured on the cluster nodes. =Hardware Monitoring & Backup= ==Disk Monitoring== Install smartmontools as per [http://www.howtoforge.com/checking-hard-disk-sanity-with-smartmontools-debian-ubuntu here]. Briefly: apt-get install smartmontools Edit <code>/etc/defaults/smartmontools</code> so that it contails: <source lang="text"> # List of devices you want to explicitly enable S.M.A.R.T. for # Not needed (and not recommended) if the device is monitored by smartd enable_smart="/dev/sda" # uncomment to start smartd on system startup start_smartd=yes # uncomment to pass additional options to smartd on startup smartd_opts="--interval=1800" </source> Edit <code>/etc/smartd.conf</code> and edit the first line that begins with DEVICESCAN (all other lines are ignored). 
Have it read something like: DEVICESCAN -d removable -n standby -m root -m robert_higgins@iol.ie -M exec /usr/share/smartmontools/smartd-runner Then start the service <code>/etc/init.d/smartmontools start</code> Note, consider installing this on all nodes, as it would be interesting to have prior notice of any failing disks. 0f88f9a581a27bde910e3dbb73087433f4ea0eb8 358 357 2010-04-29T18:53:23Z Davepc 2 wikitext text/x-wiki * Basic installation of Debian Squeeze ==Networking== ===Interfaces=== * edit <code>/etc/networks/interfaces</code> Note that at some point eth1 should be configured by DHCP, it is on the UCD LAN and must be registered correctly (update MAC address with services). <code>eth0</code> is the internal network. <source lang="text"> # The loopback network interface auto lo eth0 eth1 iface lo inet loopback # The primary network interface allow-hotplug eth0 iface eth0 inet static address 192.168.21.254 netmask 255.255.255.0 gateway 192.168.21.1 iface eth1 inet static address 193.1.132.124 netmask 255.255.252.0 gateway 193.1.132.1 </source> * Install non-free linux firmware for network interface (eth0). This will allow Gigabit operation on eth0 with the tg3 hardware (I think). Edit <code>/etc/apt/sources.list</code> including the lines: <source lang="text">deb http://ftp.ie.debian.org/debian/ squeeze main contrib non-free deb-src http://ftp.ie.debian.org/debian/ squeeze main contrib non-free</source> * Install firmware-linux: <source lang="text">apt-get update && apt-get install firmware-linux</source> You probably need to reboot now. ===DNS / BIND=== We will run our own DNS server for the cluster. First set resolv.conf: <source lang="text"> nameserver 127.0.0.1 nameserver 137.43.116.19 nameserver 137.43.116.17 nameserver 137.43.105.22 domain ucd.ie search ucd.ie </source> Now install bind9 (<code>apt-get install bind9</code>). Edit <code>/etc/bind/named.conf.local</code> and set the domain zones for the cluster (forwards and reverse). 
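Each zone declared in <code>named.conf.local</code> needs a corresponding zone file under <code>/var/cache/bind</code>. For reference, a minimal forward zone file might start like the sketch below; the serial number, timers and host addresses are illustrative placeholders, not the cluster's actual data.

```text
$TTL 86400
@   IN  SOA heterogeneous.ucd.ie. root.heterogeneous.ucd.ie. (
        2010042901 ; serial (illustrative)
        3600       ; refresh
        900        ; retry
        604800     ; expire
        86400 )    ; negative-cache TTL
    IN  NS  heterogeneous.ucd.ie.
heterogeneous.ucd.ie.  IN  A  192.168.21.254
hcl03                  IN  A  192.168.21.5
hcl07                  IN  A  192.168.21.9
```

The reverse-map files (<code>db.192.168.20</code> and <code>db.192.168.21</code>) follow the same shape, with PTR records in place of A records.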
We have two subnets where reverse lookups will have to be specified, 192.168.20 and 192.168.21:
<source lang="text">
//
// Do any local configuration here
//
// Consider adding the 1918 zones here, if they are not used in your
// organization
//include "/etc/bind/zones.rfc1918";

include "/etc/bind/rndc.key";

controls {
	inet 127.0.0.1 allow { localhost; } keys { "rndc-key"; };
};

zone "heterogeneous.ucd.ie" {
	type master;
	file "db.heterogeneous.ucd.ie";
};

zone "21.168.192.in-addr.arpa" {
	type master;
	file "db.192.168.21";
};

zone "20.168.192.in-addr.arpa" {
	type master;
	file "db.192.168.20";
};
</source>
Also edit the options file, <code>/etc/bind/named.conf.options</code>. Note the subnet we define in the allow sections, 192.168.20.0/23: it permits access from 192.168.20.* and 192.168.21.* addresses.
<source lang="text">
options {
	directory "/var/cache/bind";

	// If there is a firewall between you and nameservers you want
	// to talk to, you may need to fix the firewall to allow multiple
	// ports to talk.  See http://www.kb.cert.org/vuls/id/800113

	// If your ISP provided one or more IP addresses for stable
	// nameservers, you probably want to use them as forwarders.
	// Uncomment the following block, and insert the addresses replacing
	// the all-0's placeholder.
	forwarders {
		137.43.116.19;
		137.43.116.17;
		137.43.105.22;
	};

	recursion yes;
	version "REFUSED";
	allow-recursion { 127.0.0.1; 192.168.20.0/23; };
	allow-query { 127.0.0.1; 192.168.20.0/23; };
	auth-nxdomain no;    # conform to RFC1035
	listen-on-v6 { any; };
};
</source>
Now work on the zone files specified above, <code>db.heterogeneous.ucd.ie</code>, and the reverse maps <code>db.192.168.21</code> & <code>db.192.168.20</code>. Populate them with all nodes of the cluster.
===IP Tables===
* Set up <code>iptables</code>. We want to implement NAT between the internal network (<code>eth0</code>) and the external one (<code>eth1</code>). Add a script named <code>00iptables</code> to the <code>/etc/network/if-up.d</code> directory.
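A note on the 192.168.20.0/23 range used in the BIND allow lists above (the 255.255.254.0 netmask used later for NIS describes the same span): a 23-bit prefix leaves 9 host bits, so it covers both the .20 and .21 subnets. A quick sketch of the mask arithmetic, in plain shell, with 192.168.21.37 standing in as an arbitrary illustrative host:

```bash
#!/bin/bash
# Mask arithmetic for a /23 prefix: a host on the .21 subnet still
# falls inside the 192.168.20.0/23 network. Illustrative only.

ip_to_int() {
  local IFS=.
  read -r a b c d <<< "$1"
  echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

int_to_ip() {
  local n=$1
  echo "$(( (n >> 24) & 255 )).$(( (n >> 16) & 255 )).$(( (n >> 8) & 255 )).$(( n & 255 ))"
}

prefix=23
mask=$(( (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))   # 255.255.254.0

int_to_ip $(( $(ip_to_int 192.168.21.37) & mask ))        # prints 192.168.20.0
```

Any 192.168.20.x or 192.168.21.x address masks down to 192.168.20.0, which is why a single allow entry covers both subnets.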
All scripts in this directory are executed after the network interfaces are brought up, so the rules will persist across reboots:
<source lang="bash">
#!/bin/sh
PATH=/usr/sbin:/sbin:/bin:/usr/bin

IF_INT=eth0
IF_EXT=eth1

#
# delete all existing rules.
#
iptables -F
iptables -t nat -F
iptables -t mangle -F
iptables -X

# Always accept loopback traffic
iptables -A INPUT -i lo -j ACCEPT

# Allow established connections, and those not coming from the outside
iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
iptables -A INPUT -m state --state NEW ! -i $IF_EXT -j ACCEPT
iptables -A FORWARD -i $IF_EXT -o $IF_INT -m state --state ESTABLISHED,RELATED -j ACCEPT

# Allow outgoing connections from the LAN side.
iptables -A FORWARD -i $IF_INT -o $IF_EXT -j ACCEPT

# Masquerade.
iptables -t nat -A POSTROUTING -o $IF_EXT -j MASQUERADE

# Don't forward new connections from the outside to the inside
# (established traffic was already accepted above).
iptables -A FORWARD -i $IF_EXT -o $IF_INT -j REJECT

# Enable routing.
echo 1 > /proc/sys/net/ipv4/ip_forward
</source>
==Clonezilla==
Firstly, Clonezilla is probably going to pollute a lot of your server configuration when it sets itself up. Be prepared to lose your iptables configuration, NFS settings (if any) and DHCP settings. Maybe more.
* Follow the guide to installing Clonezilla [http://www.howtoforge.com/cloning-linux-systems-with-clonezilla-server-edition-clonezilla-se-p2 here]. Essentially:
** add the repository key: <source lang="text">wget -q http://drbl.sourceforge.net/GPG-KEY-DRBL -O- | apt-key add -</source>
** add the following line to /etc/apt/sources.list: <source lang="text">deb http://drbl.sourceforge.net/drbl-core drbl stable</source>
** run: <source lang="text">apt-get update && apt-get install drbl && /opt/drbl/sbin/drbl4imp</source>
** accept the default options to drbl4imp.
* After Clonezilla has installed, edit <code>/etc/dhcp3/dhcpd.conf</code>, adding all entries for the test nodes <code>hcl07</code> and <code>hcl03</code>.
Also ensure these nodes have been removed from the existing heterogeneous.ucd.ie server so that they are only served by one machine.
<source lang="text">
default-lease-time 300;
max-lease-time 300;
option subnet-mask 255.255.255.0;
option domain-name-servers 137.43.116.19,137.43.116.17,137.43.105.22;
option domain-name "ucd.ie";
ddns-update-style none;
# brett had ad-hoc ...?
server-name drbl;
filename = "pxelinux.0";
subnet 192.168.21.0 netmask 255.255.255.0 {
    option subnet-mask 255.255.255.0;
    option routers 192.168.21.1;
    next-server 192.168.21.254;
    pool {
        # allow members of "DRBL-Client";
        range 192.168.21.200 192.168.21.212;
    }
    host hcl03 {
        option host-name "hcl03.ucd.ie";
        hardware ethernet 00:14:22:0A:22:6C;
        fixed-address 192.168.21.5;
    }
    host hcl03_eth1 {
        option host-name "hcl03_eth1.ucd.ie";
        hardware ethernet 00:14:22:0A:22:6D;
        fixed-address 192.168.21.105;
    }
    host hcl07 {
        option host-name "hcl07.ucd.ie";
        hardware ethernet 00:14:22:0A:20:E2;
        fixed-address 192.168.21.9;
    }
    host hcl07_eth1 {
        option host-name "hcl07_eth1.ucd.ie";
        hardware ethernet 00:14:22:0A:20:E3;
        fixed-address 192.168.21.109;
    }
    default-lease-time 21600;
    max-lease-time 43200;
}
</source>
==Install DHCP==
Install the DHCP server package with <code>apt-get install dhcp3-server</code>. When you install Clonezilla it will probably pollute your DHCP server setup, so make a backup of the configuration first.
==Install NIS==
Copy the user entries from <code>passwd</code>, <code>group</code> and <code>shadow</code> in <code>/etc</code> on <code>hcl01</code>. Install nis.
Edit <code>/etc/defaultdomain</code> so that it contains:
 heterogeneous.ucd.ie
Edit <code>/etc/default/nis</code> so that it contains:
<source lang="text">
# Are we a NIS server and if so what kind (values: false, slave, master)
NISSERVER=master
</source>
Edit <code>/etc/ypserv.securenets</code> so that it contains:
<source lang="text">
# allow connections from local
255.0.0.0	127.0.0.0
# allow connections from heterogeneous subnets .20 and .21
255.255.254.0	192.168.20.0
</source>
The NIS host is also a client of itself, so do the client set-up as follows. Edit <code>/etc/hosts</code> and ensure the NIS master is listed:
 192.168.21.254	heterogeneous.ucd.ie	heterogeneous
Edit <code>/etc/yp.conf</code> and ensure that it contains:
 domain heterogeneous.ucd.ie server localhost
Edit <code>/etc/passwd</code>, adding a line to the end that reads <code>+::::::</code>. Edit <code>/etc/group</code>, adding a line <code>+:::</code> at the end.

The NIS Makefile will not pull user IDs and group IDs lower than a certain value; we must set this to 500 in <code>/var/yp/Makefile</code>:
 MINUID=500
 MINGID=500
Start the <code>ypbind</code> and <code>yppasswd</code> services. Then initialise the NIS database:
 /usr/lib/yp/ypinit -m
Accept the defaults at the prompts. Now start the other NIS services:
 service nis start
==Installing Ganglia Frontend==
Install the packages gmetad and ganglia-webfrontend. Configure the front end by appending the following to <code>/etc/apache2/apache2.conf</code>:
 Include /etc/ganglia-webfrontend/apache.conf
Configure gmetad by adding the following line to <code>/etc/ganglia/gmetad.conf</code>:
 data_source "HCL Cluster" 192.168.20.1 192.168.20.16
This means that the gmetad collector connects to hcl01 and hcl16 on the .20 subnet to gather data for the frontend to use. After all packages are configured, execute:
<source lang="text">
service apache2 restart
service gmetad restart
</source>
Pointing your browser at [http://heterogeneous.ucd.ie/ganglia/index.php here] should display the monitoring page for the HCL Cluster.
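If the page does not come up, one way to narrow it down is to read gmetad's XML stream directly; gmetad serves it on TCP port 8651 by default (the <code>xml_port</code> setting in <code>gmetad.conf</code>, so adjust if you changed it). The host name below is just the collector itself:

```text
# Dump the first few lines of gmetad's XML output
nc localhost 8651 | head
```

If XML comes back, gmetad and the gmonds are fine and the problem is on the Apache/webfrontend side.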
<code>gmond</code> must also be installed and configured on the cluster nodes. =Hardware Monitoring & Backup= ==Disk Monitoring== Install smartmontools as per [http://www.howtoforge.com/checking-hard-disk-sanity-with-smartmontools-debian-ubuntu here]. Briefly: apt-get install smartmontools Edit <code>/etc/defaults/smartmontools</code> so that it contails: <source lang="text"> # List of devices you want to explicitly enable S.M.A.R.T. for # Not needed (and not recommended) if the device is monitored by smartd enable_smart="/dev/sda" # uncomment to start smartd on system startup start_smartd=yes # uncomment to pass additional options to smartd on startup smartd_opts="--interval=1800" </source> Edit <code>/etc/smartd.conf</code> and edit the first line that begins with DEVICESCAN (all other lines are ignored). Have it read something like: DEVICESCAN -d removable -n standby -m root -m robert_higgins@iol.ie -M exec /usr/share/smartmontools/smartd-runner Then start the service <code>/etc/init.d/smartmontools start</code> Note, consider installing this on all nodes, as it would be interesting to have prior notice of any failing disks. = Torque - PBS = Allow all users to see all queued jobs: <code>qmgr -c 'set server query_other_jobs=TRUE'</code> 5f6ee376383287c4dee632e974eccefe8a8049c4 359 358 2010-04-29T18:57:59Z Davepc 2 wikitext text/x-wiki * Basic installation of Debian Squeeze ==Networking== ===Interfaces=== * edit <code>/etc/networks/interfaces</code> Note that at some point eth1 should be configured by DHCP, it is on the UCD LAN and must be registered correctly (update MAC address with services). <code>eth0</code> is the internal network. 
<source lang="text"> # The loopback network interface auto lo eth0 eth1 iface lo inet loopback # The primary network interface allow-hotplug eth0 iface eth0 inet static address 192.168.21.254 netmask 255.255.255.0 gateway 192.168.21.1 iface eth1 inet static address 193.1.132.124 netmask 255.255.252.0 gateway 193.1.132.1 </source> * Install non-free linux firmware for network interface (eth0). This will allow Gigabit operation on eth0 with the tg3 hardware (I think). Edit <code>/etc/apt/sources.list</code> including the lines: <source lang="text">deb http://ftp.ie.debian.org/debian/ squeeze main contrib non-free deb-src http://ftp.ie.debian.org/debian/ squeeze main contrib non-free</source> * Install firmware-linux: <source lang="text">apt-get update && apt-get install firmware-linux</source> You probably need to reboot now. ===DNS / BIND=== We will run our own DNS server for the cluster. First set resolv.conf: <source lang="text"> nameserver 127.0.0.1 nameserver 137.43.116.19 nameserver 137.43.116.17 nameserver 137.43.105.22 domain ucd.ie search ucd.ie </source> Now install bind9 (<code>apt-get install bind9</code>). Edit <code>/etc/bind/named.conf.local</code> and set the domain zones for the cluster (forwards and reverse). 
We have two subdomains where reverse lookups will have to be specified 192.168.20 and 192.168.21 <source lang="text"> // // Do any local configuration here // // Consider adding the 1918 zones here, if they are not used in your // organization //include "/etc/bind/zones.rfc1918"; include "/etc/bind/rndc.key"; controls { inet 127.0.0.1 allow { localhost; } keys { "rndc-key"; }; }; zone "heterogeneous.ucd.ie" { type master; file "db.heterogeneous.ucd.ie"; }; zone "21.168.192.in-addr.arpa" { type master; file "db.192.168.21"; }; zone "20.168.192.in-addr.arpa" { type master; file "db.192.168.20"; }; </source> Also edit the options file: <code>/etc/bind/named.conf.options</code>, note the subnet we define in the allow sections, 192.168.20/23, it will permit access from 192.168.20.* and 192.168.21.* addresses. <source lang="text"> options { directory "/var/cache/bind"; // If there is a firewall between you and nameservers you want // to talk to, you may need to fix the firewall to allow multiple // ports to talk. See http://www.kb.cert.org/vuls/id/800113 // If your ISP provided one or more IP addresses for stable // nameservers, you probably want to use them as forwarders. // Uncomment the following block, and insert the addresses replacing // the all-0's placeholder. forwarders { 137.43.116.19; 137.43.116.17; 137.43.105.22; }; recursion yes; version "REFUSED"; allow-recursion { 127.0.0.1; 192.168.20.0/23; }; allow-query { 127.0.0.1; 192.168.20.0/23; }; auth-nxdomain no; # conform to RFC1035 listen-on-v6 { any; }; }; </source> Now work on the zone files specified <code>db.heterogneneous.ucd.ie</code> and the reverse maps <code>db.192.168.21</code> & <code>db.192.168.21</code> Populate them with all nodes of the cluster. ===IP Tables=== * Set up <code>iptables</code>. We want to implement NAT between the internal network (<code>eth0</code>) and external one (<code>eth1</code>). Add a script to <code>/etc/network/if-up.d</code> directory, named <code>00iptables</code>. 
All scripts in this directory will be executed after network interfaces are brought up, so this will persist: <source lang="bash"> #!/bin/sh PATH=/usr/sbin:/sbin:/bin:/usr/bin IF_INT=eth0 IF_EXT=eth1 # # delete all existing rules. # iptables -F iptables -t nat -F iptables -t mangle -F iptables -X # Always accept loopback traffic iptables -A INPUT -i lo -j ACCEPT # Allow established connections, and those not coming from the outside iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT iptables -A INPUT -m state --state NEW ! -i $IF_EXT -j ACCEPT iptables -A FORWARD -i $IF_EXT -o $IF_INT -m state --state ESTABLISHED,RELATED -j ACCEPT # Allow outgoing connections from the LAN side. iptables -A FORWARD -i $IF_INT -o $IF_EXT -j ACCEPT # Masquerade. iptables -t nat -A POSTROUTING -o $IF_EXT -j MASQUERADE # Don't forward from the outside to the inside. iptables -A FORWARD -i $IF_EXT -o $IF_EXT -j REJECT # Enable routing. echo 1 > /proc/sys/net/ipv4/ip_forward </source> ==Clonezilla== Firstly, Clonezilla is probably going to pollute a lot of your server configuration when it sets itself up. Be prepared to loose your IPtables configuration, NFS (if any) and DHCP settings. Maybe more. * follow the guide to installing Clonezilla [http://www.howtoforge.com/cloning-linux-systems-with-clonezilla-server-edition-clonezilla-se-p2 here]. Essentially: ** add repository key <source lang="text">wget -q http://drbl.sourceforge.net/GPG-KEY-DRBL -O- | apt-key add -</source> ** the line add to /etc/apt/sources.list: <source lang="text">deb http://drbl.sourceforge.net/drbl-core drbl stable</source> ** run: <source lang="text">apt-get update && apt-get install drbl && /opt/drbl/sbin/drbl4imp</source> ** accept default options to drbl4imp. * After Clonezilla has installed edit <code>/etc/dhcpd3/dhcpd.conf</code>, adding all entries for test nodes <code>hcl07</code> and <code>hcl03</code>. 
Also ensure these nodes have been removed from the existing heterogeneous.ucd.ie server so that they are only served by one machine.
<source lang="text">
default-lease-time 300;
max-lease-time 300;
option subnet-mask 255.255.255.0;
option domain-name-servers 137.43.116.19,137.43.116.17,137.43.105.22;
option domain-name "ucd.ie";
ddns-update-style none;

# brett had ad-hoc ...?
server-name drbl;
filename = "pxelinux.0";

subnet 192.168.21.0 netmask 255.255.255.0 {
    option subnet-mask 255.255.255.0;
    option routers 192.168.21.1;
    next-server 192.168.21.254;
    pool {
        # allow members of "DRBL-Client";
        range 192.168.21.200 192.168.21.212;
    }
    host hcl03 {
        option host-name "hcl03.ucd.ie";
        hardware ethernet 00:14:22:0A:22:6C;
        fixed-address 192.168.21.5;
    }
    host hcl03_eth1 {
        option host-name "hcl03_eth1.ucd.ie";
        hardware ethernet 00:14:22:0A:22:6D;
        fixed-address 192.168.21.105;
    }
    host hcl07 {
        option host-name "hcl07.ucd.ie";
        hardware ethernet 00:14:22:0A:20:E2;
        fixed-address 192.168.21.9;
    }
    host hcl07_eth1 {
        option host-name "hcl07_eth1.ucd.ie";
        hardware ethernet 00:14:22:0A:20:E3;
        fixed-address 192.168.21.109;
    }
    default-lease-time 21600;
    max-lease-time 43200;
}
</source>

==Install DHCP==
Install the DHCP server package with <code>apt-get install dhcp3-server</code>. When you install Clonezilla it will probably pollute your DHCP server setup, so make a backup of your configuration first.

==Install NIS==
Copy user entries from <code>passwd</code>, <code>group</code> and <code>shadow</code> in <code>/etc</code> on <code>hcl01</code>. Install the <code>nis</code> package.
Edit <code>/etc/defaultdomain</code> so that it contains:
 heterogeneous.ucd.ie
Edit <code>/etc/default/nis</code> so that it contains:
 # Are we a NIS server and if so what kind (values: false, slave, master)
 NISSERVER=master
Edit <code>/etc/ypserv.securenets</code> so that it contains:
 # allow connections from local
 255.0.0.0       127.0.0.0
 # allow connections from heterogeneous subnets .20 and .21
 255.255.254.0   192.168.20.0
The NIS host is also a client of itself, so do the client setup as follows. Edit <code>/etc/hosts</code> and ensure the NIS master is listed:
 192.168.21.254 heterogeneous.ucd.ie heterogeneous
Edit <code>/etc/yp.conf</code> and ensure that it contains:
 domain heterogeneous.ucd.ie server localhost
Edit <code>/etc/passwd</code> adding a line to the end that reads: <code>+::::::</code>. Edit <code>/etc/group</code> with a line <code>+:::</code> at the end. The NIS Makefile will not pull user IDs and group IDs below a certain value; set these to 500 in <code>/var/yp/Makefile</code>:
 MINUID=500
 MINGID=500
Start the <code>ypbind</code> and <code>yppasswd</code> services. Then initialise the NIS database:
 /usr/lib/yp/ypinit -m
Accept defaults at prompts. Now start the other NIS services:
 service nis start

==Installing Ganglia Frontend==
Install the packages gmetad and ganglia-webfrontend. Configure the front end by appending the following to <code>/etc/apache2/apache2.conf</code>:
 Include /etc/ganglia-webfrontend/apache.conf
Configure gmetad by adding the following line to <code>/etc/ganglia/gmetad.conf</code>:
 data_source "HCL Cluster" 192.168.20.1 192.168.20.16
This means that the gmetad collector connects to hcl01 and hcl16 on the .20 subnet to gather data for the frontend to use. After all packages are configured execute:
<source lang="text">
service apache2 restart
service gmetad restart
</source>
Pointing your browser to [http://heterogeneous.ucd.ie/ganglia/index.php here] should display the monitoring page for the HCL Cluster.
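As a quick sanity check of the NIS setup above, the server should be able to answer its own map queries (these are the standard NIS client tools; actual output depends on the accounts copied from hcl01):
<source lang="text">
ypwhich          # should print the NIS master serving this client
ypcat passwd     # should list the accounts imported into the maps
</source>
If <code>ypcat passwd</code> comes back empty, re-check MINUID/MINGID in <code>/var/yp/Makefile</code> and re-run <code>make</code> in <code>/var/yp</code>.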
<code>gmond</code> must also be installed and configured on the cluster nodes.

=Hardware Monitoring & Backup=
==Disk Monitoring==
Install smartmontools as per [http://www.howtoforge.com/checking-hard-disk-sanity-with-smartmontools-debian-ubuntu here]. Briefly:
 apt-get install smartmontools
Edit <code>/etc/default/smartmontools</code> so that it contains:
<source lang="text">
# List of devices you want to explicitly enable S.M.A.R.T. for
# Not needed (and not recommended) if the device is monitored by smartd
enable_smart="/dev/sda"

# uncomment to start smartd on system startup
start_smartd=yes

# uncomment to pass additional options to smartd on startup
smartd_opts="--interval=1800"
</source>
Open <code>/etc/smartd.conf</code> and edit the first line that begins with DEVICESCAN (all lines after the first instance of DEVICESCAN are ignored). Have it read something like:
 DEVICESCAN -d removable -n standby -m root -m robert_higgins@iol.ie -M exec /usr/share/smartmontools/smartd-runner
Then start the service: <code>/etc/init.d/smartmontools start</code>

Note: consider installing this on all nodes, as it would be useful to have prior notice of any failing disks.

= Torque - PBS =
Allow all users to see all queued jobs: <code>qmgr -c 'set server query_other_jobs=TRUE'</code>
03204166ef6cc49df4c4ebc68063c354e6646ce6 360 359 2010-04-29T18:58:16Z Davepc 2 /* Torque - PBS */ wikitext text/x-wiki
* Basic installation of Debian Squeeze

==Networking==
===Interfaces===
* edit <code>/etc/network/interfaces</code>

Note that at some point eth1 should be configured by DHCP; it is on the UCD LAN and must be registered correctly (update MAC address with services). <code>eth0</code> is the internal network.
<source lang="text">
# The loopback network interface
auto lo eth0 eth1
iface lo inet loopback

# The primary network interface
allow-hotplug eth0
iface eth0 inet static
    address 192.168.21.254
    netmask 255.255.255.0
    gateway 192.168.21.1

iface eth1 inet static
    address 193.1.132.124
    netmask 255.255.252.0
    gateway 193.1.132.1
</source>
* Install non-free Linux firmware for the network interface (eth0). This will allow Gigabit operation on eth0 with the tg3 hardware (I think). Edit <code>/etc/apt/sources.list</code> adding the lines:
<source lang="text">deb http://ftp.ie.debian.org/debian/ squeeze main contrib non-free
deb-src http://ftp.ie.debian.org/debian/ squeeze main contrib non-free</source>
* Install firmware-linux: <source lang="text">apt-get update && apt-get install firmware-linux</source>
You probably need to reboot now.

===DNS / BIND===
We will run our own DNS server for the cluster. First set <code>/etc/resolv.conf</code>:
<source lang="text">
nameserver 127.0.0.1
nameserver 137.43.116.19
nameserver 137.43.116.17
nameserver 137.43.105.22
domain ucd.ie
search ucd.ie
</source>
Now install bind9 (<code>apt-get install bind9</code>). Edit <code>/etc/bind/named.conf.local</code> and set the domain zones for the cluster (forwards and reverse).
We have two subdomains where reverse lookups will have to be specified: 192.168.20 and 192.168.21.
<source lang="text">
//
// Do any local configuration here
//
// Consider adding the 1918 zones here, if they are not used in your
// organization
//include "/etc/bind/zones.rfc1918";

include "/etc/bind/rndc.key";

controls {
    inet 127.0.0.1 allow { localhost; } keys { "rndc-key"; };
};

zone "heterogeneous.ucd.ie" {
    type master;
    file "db.heterogeneous.ucd.ie";
};

zone "21.168.192.in-addr.arpa" {
    type master;
    file "db.192.168.21";
};

zone "20.168.192.in-addr.arpa" {
    type master;
    file "db.192.168.20";
};
</source>
Also edit the options file <code>/etc/bind/named.conf.options</code>. Note the subnet we define in the allow sections, 192.168.20.0/23; it permits access from 192.168.20.* and 192.168.21.* addresses.
<source lang="text">
options {
    directory "/var/cache/bind";

    // If there is a firewall between you and nameservers you want
    // to talk to, you may need to fix the firewall to allow multiple
    // ports to talk. See http://www.kb.cert.org/vuls/id/800113

    // If your ISP provided one or more IP addresses for stable
    // nameservers, you probably want to use them as forwarders.
    // Uncomment the following block, and insert the addresses replacing
    // the all-0's placeholder.
    forwarders { 137.43.116.19; 137.43.116.17; 137.43.105.22; };

    recursion yes;
    version "REFUSED";
    allow-recursion { 127.0.0.1; 192.168.20.0/23; };
    allow-query { 127.0.0.1; 192.168.20.0/23; };

    auth-nxdomain no; # conform to RFC1035
    listen-on-v6 { any; };
};
</source>
Now work on the zone files specified, <code>db.heterogeneous.ucd.ie</code>, and the reverse maps <code>db.192.168.21</code> and <code>db.192.168.20</code>. Populate them with all nodes of the cluster.

===IP Tables===
* Set up <code>iptables</code>. We want to implement NAT between the internal network (<code>eth0</code>) and the external one (<code>eth1</code>). Add a script to the <code>/etc/network/if-up.d</code> directory, named <code>00iptables</code>.
All scripts in this directory will be executed after network interfaces are brought up, so this will persist:
<source lang="bash">
#!/bin/sh
PATH=/usr/sbin:/sbin:/bin:/usr/bin

IF_INT=eth0
IF_EXT=eth1

#
# delete all existing rules.
#
iptables -F
iptables -t nat -F
iptables -t mangle -F
iptables -X

# Always accept loopback traffic
iptables -A INPUT -i lo -j ACCEPT

# Allow established connections, and those not coming from the outside
iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
iptables -A INPUT -m state --state NEW ! -i $IF_EXT -j ACCEPT
iptables -A FORWARD -i $IF_EXT -o $IF_INT -m state --state ESTABLISHED,RELATED -j ACCEPT

# Allow outgoing connections from the LAN side.
iptables -A FORWARD -i $IF_INT -o $IF_EXT -j ACCEPT

# Masquerade.
iptables -t nat -A POSTROUTING -o $IF_EXT -j MASQUERADE

# Don't forward new connections from the outside to the inside.
iptables -A FORWARD -i $IF_EXT -o $IF_INT -j REJECT

# Enable routing.
echo 1 > /proc/sys/net/ipv4/ip_forward
</source>

==Clonezilla==
Firstly, Clonezilla is probably going to pollute a lot of your server configuration when it sets itself up. Be prepared to lose your iptables configuration, NFS (if any) and DHCP settings. Maybe more.
* follow the guide to installing Clonezilla [http://www.howtoforge.com/cloning-linux-systems-with-clonezilla-server-edition-clonezilla-se-p2 here]. Essentially:
** add the repository key: <source lang="text">wget -q http://drbl.sourceforge.net/GPG-KEY-DRBL -O- | apt-key add -</source>
** add the line to /etc/apt/sources.list: <source lang="text">deb http://drbl.sourceforge.net/drbl-core drbl stable</source>
** run: <source lang="text">apt-get update && apt-get install drbl && /opt/drbl/sbin/drbl4imp</source>
** accept the default options for drbl4imp.
* After Clonezilla has installed edit <code>/etc/dhcp3/dhcpd.conf</code>, adding all entries for test nodes <code>hcl07</code> and <code>hcl03</code>.
Also ensure these nodes have been removed from the existing heterogeneous.ucd.ie server so that they are only served by one machine.
<source lang="text">
default-lease-time 300;
max-lease-time 300;
option subnet-mask 255.255.255.0;
option domain-name-servers 137.43.116.19,137.43.116.17,137.43.105.22;
option domain-name "ucd.ie";
ddns-update-style none;

# brett had ad-hoc ...?
server-name drbl;
filename = "pxelinux.0";

subnet 192.168.21.0 netmask 255.255.255.0 {
    option subnet-mask 255.255.255.0;
    option routers 192.168.21.1;
    next-server 192.168.21.254;
    pool {
        # allow members of "DRBL-Client";
        range 192.168.21.200 192.168.21.212;
    }
    host hcl03 {
        option host-name "hcl03.ucd.ie";
        hardware ethernet 00:14:22:0A:22:6C;
        fixed-address 192.168.21.5;
    }
    host hcl03_eth1 {
        option host-name "hcl03_eth1.ucd.ie";
        hardware ethernet 00:14:22:0A:22:6D;
        fixed-address 192.168.21.105;
    }
    host hcl07 {
        option host-name "hcl07.ucd.ie";
        hardware ethernet 00:14:22:0A:20:E2;
        fixed-address 192.168.21.9;
    }
    host hcl07_eth1 {
        option host-name "hcl07_eth1.ucd.ie";
        hardware ethernet 00:14:22:0A:20:E3;
        fixed-address 192.168.21.109;
    }
    default-lease-time 21600;
    max-lease-time 43200;
}
</source>

==Install DHCP==
Install the DHCP server package with <code>apt-get install dhcp3-server</code>. When you install Clonezilla it will probably pollute your DHCP server setup, so make a backup of your configuration first.

==Install NIS==
Copy user entries from <code>passwd</code>, <code>group</code> and <code>shadow</code> in <code>/etc</code> on <code>hcl01</code>. Install the <code>nis</code> package.
Edit <code>/etc/defaultdomain</code> so that it contains:
 heterogeneous.ucd.ie
Edit <code>/etc/default/nis</code> so that it contains:
 # Are we a NIS server and if so what kind (values: false, slave, master)
 NISSERVER=master
Edit <code>/etc/ypserv.securenets</code> so that it contains:
 # allow connections from local
 255.0.0.0       127.0.0.0
 # allow connections from heterogeneous subnets .20 and .21
 255.255.254.0   192.168.20.0
The NIS host is also a client of itself, so do the client setup as follows. Edit <code>/etc/hosts</code> and ensure the NIS master is listed:
 192.168.21.254 heterogeneous.ucd.ie heterogeneous
Edit <code>/etc/yp.conf</code> and ensure that it contains:
 domain heterogeneous.ucd.ie server localhost
Edit <code>/etc/passwd</code> adding a line to the end that reads: <code>+::::::</code>. Edit <code>/etc/group</code> with a line <code>+:::</code> at the end. The NIS Makefile will not pull user IDs and group IDs below a certain value; set these to 500 in <code>/var/yp/Makefile</code>:
 MINUID=500
 MINGID=500
Start the <code>ypbind</code> and <code>yppasswd</code> services. Then initialise the NIS database:
 /usr/lib/yp/ypinit -m
Accept defaults at prompts. Now start the other NIS services:
 service nis start

==Installing Ganglia Frontend==
Install the packages gmetad and ganglia-webfrontend. Configure the front end by appending the following to <code>/etc/apache2/apache2.conf</code>:
 Include /etc/ganglia-webfrontend/apache.conf
Configure gmetad by adding the following line to <code>/etc/ganglia/gmetad.conf</code>:
 data_source "HCL Cluster" 192.168.20.1 192.168.20.16
This means that the gmetad collector connects to hcl01 and hcl16 on the .20 subnet to gather data for the frontend to use. After all packages are configured execute:
<source lang="text">
service apache2 restart
service gmetad restart
</source>
Pointing your browser to [http://heterogeneous.ucd.ie/ganglia/index.php here] should display the monitoring page for the HCL Cluster.
<code>gmond</code> must also be installed and configured on the cluster nodes.

=Hardware Monitoring & Backup=
==Disk Monitoring==
Install smartmontools as per [http://www.howtoforge.com/checking-hard-disk-sanity-with-smartmontools-debian-ubuntu here]. Briefly:
 apt-get install smartmontools
Edit <code>/etc/default/smartmontools</code> so that it contains:
<source lang="text">
# List of devices you want to explicitly enable S.M.A.R.T. for
# Not needed (and not recommended) if the device is monitored by smartd
enable_smart="/dev/sda"

# uncomment to start smartd on system startup
start_smartd=yes

# uncomment to pass additional options to smartd on startup
smartd_opts="--interval=1800"
</source>
Open <code>/etc/smartd.conf</code> and edit the first line that begins with DEVICESCAN (all lines after the first instance of DEVICESCAN are ignored). Have it read something like:
 DEVICESCAN -d removable -n standby -m root -m robert_higgins@iol.ie -M exec /usr/share/smartmontools/smartd-runner
Then start the service: <code>/etc/init.d/smartmontools start</code>

Note: consider installing this on all nodes, as it would be useful to have prior notice of any failing disks.

= Torque - PBS =
Allow all users to see all queued jobs: <code>qmgr -c 'set server query_other_jobs=TRUE'</code>
1fcb6df6feb39da9aadcd91458fd155c7eb9d6c3 361 360 2010-04-29T18:59:11Z Davepc 2 /* Torque - PBS */ wikitext text/x-wiki
* Basic installation of Debian Squeeze

==Networking==
===Interfaces===
* edit <code>/etc/network/interfaces</code>

Note that at some point eth1 should be configured by DHCP; it is on the UCD LAN and must be registered correctly (update MAC address with services). <code>eth0</code> is the internal network.
<source lang="text">
# The loopback network interface
auto lo eth0 eth1
iface lo inet loopback

# The primary network interface
allow-hotplug eth0
iface eth0 inet static
    address 192.168.21.254
    netmask 255.255.255.0
    gateway 192.168.21.1

iface eth1 inet static
    address 193.1.132.124
    netmask 255.255.252.0
    gateway 193.1.132.1
</source>
* Install non-free Linux firmware for the network interface (eth0). This will allow Gigabit operation on eth0 with the tg3 hardware (I think). Edit <code>/etc/apt/sources.list</code> adding the lines:
<source lang="text">deb http://ftp.ie.debian.org/debian/ squeeze main contrib non-free
deb-src http://ftp.ie.debian.org/debian/ squeeze main contrib non-free</source>
* Install firmware-linux: <source lang="text">apt-get update && apt-get install firmware-linux</source>
You probably need to reboot now.

===DNS / BIND===
We will run our own DNS server for the cluster. First set <code>/etc/resolv.conf</code>:
<source lang="text">
nameserver 127.0.0.1
nameserver 137.43.116.19
nameserver 137.43.116.17
nameserver 137.43.105.22
domain ucd.ie
search ucd.ie
</source>
Now install bind9 (<code>apt-get install bind9</code>). Edit <code>/etc/bind/named.conf.local</code> and set the domain zones for the cluster (forwards and reverse).
We have two subdomains where reverse lookups will have to be specified: 192.168.20 and 192.168.21.
<source lang="text">
//
// Do any local configuration here
//
// Consider adding the 1918 zones here, if they are not used in your
// organization
//include "/etc/bind/zones.rfc1918";

include "/etc/bind/rndc.key";

controls {
    inet 127.0.0.1 allow { localhost; } keys { "rndc-key"; };
};

zone "heterogeneous.ucd.ie" {
    type master;
    file "db.heterogeneous.ucd.ie";
};

zone "21.168.192.in-addr.arpa" {
    type master;
    file "db.192.168.21";
};

zone "20.168.192.in-addr.arpa" {
    type master;
    file "db.192.168.20";
};
</source>
Also edit the options file <code>/etc/bind/named.conf.options</code>. Note the subnet we define in the allow sections, 192.168.20.0/23; it permits access from 192.168.20.* and 192.168.21.* addresses.
<source lang="text">
options {
    directory "/var/cache/bind";

    // If there is a firewall between you and nameservers you want
    // to talk to, you may need to fix the firewall to allow multiple
    // ports to talk. See http://www.kb.cert.org/vuls/id/800113

    // If your ISP provided one or more IP addresses for stable
    // nameservers, you probably want to use them as forwarders.
    // Uncomment the following block, and insert the addresses replacing
    // the all-0's placeholder.
    forwarders { 137.43.116.19; 137.43.116.17; 137.43.105.22; };

    recursion yes;
    version "REFUSED";
    allow-recursion { 127.0.0.1; 192.168.20.0/23; };
    allow-query { 127.0.0.1; 192.168.20.0/23; };

    auth-nxdomain no; # conform to RFC1035
    listen-on-v6 { any; };
};
</source>
Now work on the zone files specified, <code>db.heterogeneous.ucd.ie</code>, and the reverse maps <code>db.192.168.21</code> and <code>db.192.168.20</code>. Populate them with all nodes of the cluster.

===IP Tables===
* Set up <code>iptables</code>. We want to implement NAT between the internal network (<code>eth0</code>) and the external one (<code>eth1</code>). Add a script to the <code>/etc/network/if-up.d</code> directory, named <code>00iptables</code>.
All scripts in this directory will be executed after network interfaces are brought up, so this will persist:
<source lang="bash">
#!/bin/sh
PATH=/usr/sbin:/sbin:/bin:/usr/bin

IF_INT=eth0
IF_EXT=eth1

#
# delete all existing rules.
#
iptables -F
iptables -t nat -F
iptables -t mangle -F
iptables -X

# Always accept loopback traffic
iptables -A INPUT -i lo -j ACCEPT

# Allow established connections, and those not coming from the outside
iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
iptables -A INPUT -m state --state NEW ! -i $IF_EXT -j ACCEPT
iptables -A FORWARD -i $IF_EXT -o $IF_INT -m state --state ESTABLISHED,RELATED -j ACCEPT

# Allow outgoing connections from the LAN side.
iptables -A FORWARD -i $IF_INT -o $IF_EXT -j ACCEPT

# Masquerade.
iptables -t nat -A POSTROUTING -o $IF_EXT -j MASQUERADE

# Don't forward new connections from the outside to the inside.
iptables -A FORWARD -i $IF_EXT -o $IF_INT -j REJECT

# Enable routing.
echo 1 > /proc/sys/net/ipv4/ip_forward
</source>

==Clonezilla==
Firstly, Clonezilla is probably going to pollute a lot of your server configuration when it sets itself up. Be prepared to lose your iptables configuration, NFS (if any) and DHCP settings. Maybe more.
* follow the guide to installing Clonezilla [http://www.howtoforge.com/cloning-linux-systems-with-clonezilla-server-edition-clonezilla-se-p2 here]. Essentially:
** add the repository key: <source lang="text">wget -q http://drbl.sourceforge.net/GPG-KEY-DRBL -O- | apt-key add -</source>
** add the line to /etc/apt/sources.list: <source lang="text">deb http://drbl.sourceforge.net/drbl-core drbl stable</source>
** run: <source lang="text">apt-get update && apt-get install drbl && /opt/drbl/sbin/drbl4imp</source>
** accept the default options for drbl4imp.
* After Clonezilla has installed edit <code>/etc/dhcp3/dhcpd.conf</code>, adding all entries for test nodes <code>hcl07</code> and <code>hcl03</code>.
Also ensure these nodes have been removed from the existing heterogeneous.ucd.ie server so that they are only served by one machine.
<source lang="text">
default-lease-time 300;
max-lease-time 300;
option subnet-mask 255.255.255.0;
option domain-name-servers 137.43.116.19,137.43.116.17,137.43.105.22;
option domain-name "ucd.ie";
ddns-update-style none;

# brett had ad-hoc ...?
server-name drbl;
filename = "pxelinux.0";

subnet 192.168.21.0 netmask 255.255.255.0 {
    option subnet-mask 255.255.255.0;
    option routers 192.168.21.1;
    next-server 192.168.21.254;
    pool {
        # allow members of "DRBL-Client";
        range 192.168.21.200 192.168.21.212;
    }
    host hcl03 {
        option host-name "hcl03.ucd.ie";
        hardware ethernet 00:14:22:0A:22:6C;
        fixed-address 192.168.21.5;
    }
    host hcl03_eth1 {
        option host-name "hcl03_eth1.ucd.ie";
        hardware ethernet 00:14:22:0A:22:6D;
        fixed-address 192.168.21.105;
    }
    host hcl07 {
        option host-name "hcl07.ucd.ie";
        hardware ethernet 00:14:22:0A:20:E2;
        fixed-address 192.168.21.9;
    }
    host hcl07_eth1 {
        option host-name "hcl07_eth1.ucd.ie";
        hardware ethernet 00:14:22:0A:20:E3;
        fixed-address 192.168.21.109;
    }
    default-lease-time 21600;
    max-lease-time 43200;
}
</source>

==Install DHCP==
Install the DHCP server package with <code>apt-get install dhcp3-server</code>. When you install Clonezilla it will probably pollute your DHCP server setup, so make a backup of your configuration first.

==Install NIS==
Copy user entries from <code>passwd</code>, <code>group</code> and <code>shadow</code> in <code>/etc</code> on <code>hcl01</code>. Install the <code>nis</code> package.
Edit <code>/etc/defaultdomain</code> so that it contains:
 heterogeneous.ucd.ie
Edit <code>/etc/default/nis</code> so that it contains:
 # Are we a NIS server and if so what kind (values: false, slave, master)
 NISSERVER=master
Edit <code>/etc/ypserv.securenets</code> so that it contains:
 # allow connections from local
 255.0.0.0       127.0.0.0
 # allow connections from heterogeneous subnets .20 and .21
 255.255.254.0   192.168.20.0
The NIS host is also a client of itself, so do the client setup as follows. Edit <code>/etc/hosts</code> and ensure the NIS master is listed:
 192.168.21.254 heterogeneous.ucd.ie heterogeneous
Edit <code>/etc/yp.conf</code> and ensure that it contains:
 domain heterogeneous.ucd.ie server localhost
Edit <code>/etc/passwd</code> adding a line to the end that reads: <code>+::::::</code>. Edit <code>/etc/group</code> with a line <code>+:::</code> at the end. The NIS Makefile will not pull user IDs and group IDs below a certain value; set these to 500 in <code>/var/yp/Makefile</code>:
 MINUID=500
 MINGID=500
Start the <code>ypbind</code> and <code>yppasswd</code> services. Then initialise the NIS database:
 /usr/lib/yp/ypinit -m
Accept defaults at prompts. Now start the other NIS services:
 service nis start

==Installing Ganglia Frontend==
Install the packages gmetad and ganglia-webfrontend. Configure the front end by appending the following to <code>/etc/apache2/apache2.conf</code>:
 Include /etc/ganglia-webfrontend/apache.conf
Configure gmetad by adding the following line to <code>/etc/ganglia/gmetad.conf</code>:
 data_source "HCL Cluster" 192.168.20.1 192.168.20.16
This means that the gmetad collector connects to hcl01 and hcl16 on the .20 subnet to gather data for the frontend to use. After all packages are configured execute:
<source lang="text">
service apache2 restart
service gmetad restart
</source>
Pointing your browser to [http://heterogeneous.ucd.ie/ganglia/index.php here] should display the monitoring page for the HCL Cluster.
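For reference, a reverse map such as <code>db.192.168.21</code> (referenced in the BIND section earlier) could be sketched as follows. The host entries use the fixed addresses from the DHCP configuration; the SOA serial and timer values are placeholders, and the list should be extended to all nodes:
<source lang="text">
$TTL 86400
@   IN  SOA heterogeneous.ucd.ie. root.heterogeneous.ucd.ie. (
        2010050201 ; serial (placeholder)
        28800      ; refresh
        7200       ; retry
        604800     ; expire
        86400 )    ; minimum
    IN  NS   heterogeneous.ucd.ie.
5   IN  PTR  hcl03.ucd.ie.
9   IN  PTR  hcl07.ucd.ie.
254 IN  PTR  heterogeneous.ucd.ie.
</source>
After editing, <code>named-checkzone 21.168.192.in-addr.arpa db.192.168.21</code> will catch syntax errors before reloading bind.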
<code>gmond</code> must also be installed and configured on the cluster nodes.

=Hardware Monitoring & Backup=
==Disk Monitoring==
Install smartmontools as per [http://www.howtoforge.com/checking-hard-disk-sanity-with-smartmontools-debian-ubuntu here]. Briefly:
 apt-get install smartmontools
Edit <code>/etc/default/smartmontools</code> so that it contains:
<source lang="text">
# List of devices you want to explicitly enable S.M.A.R.T. for
# Not needed (and not recommended) if the device is monitored by smartd
enable_smart="/dev/sda"

# uncomment to start smartd on system startup
start_smartd=yes

# uncomment to pass additional options to smartd on startup
smartd_opts="--interval=1800"
</source>
Open <code>/etc/smartd.conf</code> and edit the first line that begins with DEVICESCAN (all lines after the first instance of DEVICESCAN are ignored). Have it read something like:
 DEVICESCAN -d removable -n standby -m root -m robert_higgins@iol.ie -M exec /usr/share/smartmontools/smartd-runner
Then start the service: <code>/etc/init.d/smartmontools start</code>

Note: consider installing this on all nodes, as it would be useful to have prior notice of any failing disks.

= Torque - PBS =
Allow all users to see all queued jobs: <code>qmgr -c 'set server query_other_jobs=TRUE'</code>
681d6dea432aea90d88876814ea014944e6989f3 362 361 2010-04-30T08:52:30Z Rhiggins 4 /* Disk Monitoring */ wikitext text/x-wiki
* Basic installation of Debian Squeeze

==Networking==
===Interfaces===
* edit <code>/etc/network/interfaces</code>

Note that at some point eth1 should be configured by DHCP; it is on the UCD LAN and must be registered correctly (update MAC address with services). <code>eth0</code> is the internal network.
<source lang="text">
# The loopback network interface
auto lo eth0 eth1
iface lo inet loopback

# The primary network interface
allow-hotplug eth0
iface eth0 inet static
    address 192.168.21.254
    netmask 255.255.255.0
    gateway 192.168.21.1

iface eth1 inet static
    address 193.1.132.124
    netmask 255.255.252.0
    gateway 193.1.132.1
</source>
* Install non-free Linux firmware for the network interface (eth0). This will allow Gigabit operation on eth0 with the tg3 hardware (I think). Edit <code>/etc/apt/sources.list</code> adding the lines:
<source lang="text">deb http://ftp.ie.debian.org/debian/ squeeze main contrib non-free
deb-src http://ftp.ie.debian.org/debian/ squeeze main contrib non-free</source>
* Install firmware-linux: <source lang="text">apt-get update && apt-get install firmware-linux</source>
You probably need to reboot now.

===DNS / BIND===
We will run our own DNS server for the cluster. First set <code>/etc/resolv.conf</code>:
<source lang="text">
nameserver 127.0.0.1
nameserver 137.43.116.19
nameserver 137.43.116.17
nameserver 137.43.105.22
domain ucd.ie
search ucd.ie
</source>
Now install bind9 (<code>apt-get install bind9</code>). Edit <code>/etc/bind/named.conf.local</code> and set the domain zones for the cluster (forwards and reverse).
We have two subdomains where reverse lookups will have to be specified: 192.168.20 and 192.168.21.
<source lang="text">
//
// Do any local configuration here
//
// Consider adding the 1918 zones here, if they are not used in your
// organization
//include "/etc/bind/zones.rfc1918";

include "/etc/bind/rndc.key";

controls {
    inet 127.0.0.1 allow { localhost; } keys { "rndc-key"; };
};

zone "heterogeneous.ucd.ie" {
    type master;
    file "db.heterogeneous.ucd.ie";
};

zone "21.168.192.in-addr.arpa" {
    type master;
    file "db.192.168.21";
};

zone "20.168.192.in-addr.arpa" {
    type master;
    file "db.192.168.20";
};
</source>
Also edit the options file <code>/etc/bind/named.conf.options</code>. Note the subnet we define in the allow sections, 192.168.20.0/23; it permits access from 192.168.20.* and 192.168.21.* addresses.
<source lang="text">
options {
    directory "/var/cache/bind";

    // If there is a firewall between you and nameservers you want
    // to talk to, you may need to fix the firewall to allow multiple
    // ports to talk. See http://www.kb.cert.org/vuls/id/800113

    // If your ISP provided one or more IP addresses for stable
    // nameservers, you probably want to use them as forwarders.
    // Uncomment the following block, and insert the addresses replacing
    // the all-0's placeholder.
    forwarders { 137.43.116.19; 137.43.116.17; 137.43.105.22; };

    recursion yes;
    version "REFUSED";
    allow-recursion { 127.0.0.1; 192.168.20.0/23; };
    allow-query { 127.0.0.1; 192.168.20.0/23; };

    auth-nxdomain no; # conform to RFC1035
    listen-on-v6 { any; };
};
</source>
Now work on the zone files specified, <code>db.heterogeneous.ucd.ie</code>, and the reverse maps <code>db.192.168.21</code> and <code>db.192.168.20</code>. Populate them with all nodes of the cluster.

===IP Tables===
* Set up <code>iptables</code>. We want to implement NAT between the internal network (<code>eth0</code>) and the external one (<code>eth1</code>). Add a script to the <code>/etc/network/if-up.d</code> directory, named <code>00iptables</code>.
All scripts in this directory will be executed after network interfaces are brought up, so this will persist:
<source lang="bash">
#!/bin/sh
PATH=/usr/sbin:/sbin:/bin:/usr/bin

IF_INT=eth0
IF_EXT=eth1

#
# delete all existing rules.
#
iptables -F
iptables -t nat -F
iptables -t mangle -F
iptables -X

# Always accept loopback traffic
iptables -A INPUT -i lo -j ACCEPT

# Allow established connections, and those not coming from the outside
iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
iptables -A INPUT -m state --state NEW ! -i $IF_EXT -j ACCEPT
iptables -A FORWARD -i $IF_EXT -o $IF_INT -m state --state ESTABLISHED,RELATED -j ACCEPT

# Allow outgoing connections from the LAN side.
iptables -A FORWARD -i $IF_INT -o $IF_EXT -j ACCEPT

# Masquerade.
iptables -t nat -A POSTROUTING -o $IF_EXT -j MASQUERADE

# Don't forward new connections from the outside to the inside.
iptables -A FORWARD -i $IF_EXT -o $IF_INT -j REJECT

# Enable routing.
echo 1 > /proc/sys/net/ipv4/ip_forward
</source>

==Clonezilla==
Firstly, Clonezilla is probably going to pollute a lot of your server configuration when it sets itself up. Be prepared to lose your iptables configuration, NFS (if any) and DHCP settings. Maybe more.
* follow the guide to installing Clonezilla [http://www.howtoforge.com/cloning-linux-systems-with-clonezilla-server-edition-clonezilla-se-p2 here]. Essentially:
** add the repository key: <source lang="text">wget -q http://drbl.sourceforge.net/GPG-KEY-DRBL -O- | apt-key add -</source>
** add the line to /etc/apt/sources.list: <source lang="text">deb http://drbl.sourceforge.net/drbl-core drbl stable</source>
** run: <source lang="text">apt-get update && apt-get install drbl && /opt/drbl/sbin/drbl4imp</source>
** accept the default options for drbl4imp.
* After Clonezilla has installed edit <code>/etc/dhcp3/dhcpd.conf</code>, adding all entries for test nodes <code>hcl07</code> and <code>hcl03</code>.
Also ensure these nodes have been removed from the existing heterogeneous.ucd.ie server so that they are only served by one machine.
<source lang="text">
default-lease-time 300;
max-lease-time 300;
option subnet-mask 255.255.255.0;
option domain-name-servers 137.43.116.19,137.43.116.17,137.43.105.22;
option domain-name "ucd.ie";
ddns-update-style none;

# brett had ad-hoc ...?
server-name drbl;
filename = "pxelinux.0";

subnet 192.168.21.0 netmask 255.255.255.0 {
    option subnet-mask 255.255.255.0;
    option routers 192.168.21.1;
    next-server 192.168.21.254;
    pool {
        # allow members of "DRBL-Client";
        range 192.168.21.200 192.168.21.212;
    }
    host hcl03 {
        option host-name "hcl03.ucd.ie";
        hardware ethernet 00:14:22:0A:22:6C;
        fixed-address 192.168.21.5;
    }
    host hcl03_eth1 {
        option host-name "hcl03_eth1.ucd.ie";
        hardware ethernet 00:14:22:0A:22:6D;
        fixed-address 192.168.21.105;
    }
    host hcl07 {
        option host-name "hcl07.ucd.ie";
        hardware ethernet 00:14:22:0A:20:E2;
        fixed-address 192.168.21.9;
    }
    host hcl07_eth1 {
        option host-name "hcl07_eth1.ucd.ie";
        hardware ethernet 00:14:22:0A:20:E3;
        fixed-address 192.168.21.109;
    }
    default-lease-time 21600;
    max-lease-time 43200;
}
</source>

==Install DHCP==
Install the DHCP server package with <code>apt-get install dhcp3-server</code>. When you install Clonezilla it will probably pollute your DHCP server setup, so make a backup of your configuration first.

==Install NIS==
Copy user entries from <code>passwd</code>, <code>group</code> and <code>shadow</code> in <code>/etc</code> on <code>hcl01</code>. Install the <code>nis</code> package.
Edit <code>/etc/defaultdomain</code> so that it contains:
 heterogeneous.ucd.ie
Edit <code>/etc/default/nis</code> so that it contains:
 # Are we a NIS server and if so what kind (values: false, slave, master)
 NISSERVER=master
Edit <code>/etc/ypserv.securenets</code> so that it contains:
 # allow connections from local
 255.0.0.0       127.0.0.0
 # allow connections from heterogeneous subnets .20 and .21
 255.255.254.0   192.168.20.0
The NIS host is also a client of itself, so do the client set-up as follows. Edit <code>/etc/hosts</code> and ensure the NIS master is listed:
 192.168.21.254 heterogeneous.ucd.ie heterogeneous
Edit <code>/etc/yp.conf</code> and ensure that it contains:
 domain heterogeneous.ucd.ie server localhost
Edit <code>/etc/passwd</code>, adding a line to the end that reads <code>+::::::</code>. Edit <code>/etc/group</code>, adding a line <code>+:::</code> at the end. The NIS Makefile will not pull in user and group IDs lower than a certain value; we must set this to 500 in <code>/var/yp/Makefile</code>:
 MINUID=500
 MINGID=500
Start the <code>ypbind</code> and <code>yppasswd</code> services. Then initialise the NIS database:
 /usr/lib/yp/ypinit -m
Accept the defaults at the prompts. Now start the other NIS services:
 service nis start

==Installing Ganglia Frontend==
Install the packages gmetad and ganglia-webfrontend. Configure the front end by appending the following to <code>/etc/apache2/apache2.conf</code>:
 Include /etc/ganglia-webfrontend/apache.conf
Configure gmetad by adding the following line to <code>/etc/ganglia/gmetad.conf</code>:
 data_source "HCL Cluster" 192.168.20.1 192.168.20.16
This means that the gmetad collector connects to hcl01 and hcl16 on the .20 subnet to gather data for the frontend to use. After all packages are configured, execute:
<source lang="text">
service apache2 restart
service gmetad restart
</source>
Pointing your browser [http://heterogeneous.ucd.ie/ganglia/index.php here] should display the monitoring page for the HCL Cluster.
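If gmetad ever stops updating, the first thing to check is which hosts its data_source line actually polls. A hedged sketch (not part of the Ganglia documentation) that pulls the host list out of a gmetad.conf-style file; note that real gmetad also accepts <code>host:port</code> entries and an optional polling interval, which this simple version does not handle:
<source lang="bash">
#!/bin/sh
# Extract the host list from a gmetad.conf data_source line, e.g.
#   data_source "HCL Cluster" 192.168.20.1 192.168.20.16
# Assumes the simple form used above (no polling interval, no host:port).
data_source_hosts() {
    grep '^data_source' "$1" | sed 's/^data_source[[:space:]]*"[^"]*"[[:space:]]*//'
}

# Demonstrate on a throwaway file.
f=$(mktemp)
echo 'data_source "HCL Cluster" 192.168.20.1 192.168.20.16' > "$f"
data_source_hosts "$f"
rm -f "$f"
</source>
The resulting list can be fed to a loop that pings each host or probes gmond's default XML port.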
<code>gmond</code> must also be installed and configured on the cluster nodes.

=Hardware Monitoring & Backup=
==Disk Monitoring==
Install smartmontools as per the guide [http://www.howtoforge.com/checking-hard-disk-sanity-with-smartmontools-debian-ubuntu here]. Briefly:
 apt-get install smartmontools
Edit <code>/etc/default/smartmontools</code> so that it contains:
<source lang="text">
# List of devices you want to explicitly enable S.M.A.R.T. for
# Not needed (and not recommended) if the device is monitored by smartd
enable_smart="/dev/sda"

# uncomment to start smartd on system startup
start_smartd=yes

# uncomment to pass additional options to smartd on startup
smartd_opts="--interval=1800"
</source>
Open <code>/etc/smartd.conf</code> and edit the first line that begins with DEVICESCAN (all lines after the first instance of DEVICESCAN are ignored). Have it read something like:
 DEVICESCAN -d removable -n standby -m root -m robert_higgins@iol.ie -M exec /usr/share/smartmontools/smartd-runner
Then start the service: <code>/etc/init.d/smartmontools start</code>

Note: consider installing this on all nodes, as it would be useful to have prior notice of any failing disks.
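Since smartd honours only the first DEVICESCAN line, it is easy to end up editing one that is being ignored. A small sketch (my own addition, demonstrated on a throwaway file) that prints the DEVICESCAN line actually in effect:
<source lang="bash">
#!/bin/sh
# Print the first DEVICESCAN line of a smartd.conf-style file --
# the only one smartd acts on; later DEVICESCAN lines are ignored.
effective_devicescan() {
    grep '^DEVICESCAN' "$1" | head -n 1
}

# Demonstrate on a throwaway file with a stale second entry.
f=$(mktemp)
cat > "$f" <<'EOF'
# comment
DEVICESCAN -d removable -n standby -m root
DEVICESCAN -H
EOF
effective_devicescan "$f"
rm -f "$f"
</source>
Run it against <code>/etc/smartd.conf</code> after editing to confirm your change is the one in force.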
= Torque - PBS =
Download Torque from [http://www.clusterresources.com/pages/products/torque-resource-manager.php here]. Extract the archive and configure it:
 ./configure --prefix=/usr/local
Run <code>make && make install</code>. Torque will have installed files under <code>/usr/local</code> and <code>/var/spool/torque</code>.

Edit the Torque config file <code>/var/spool/torque/torque.cfg</code>, adding the line:
 SERVERHOST=192.168.20.254
Add the compute nodes of the cluster to the file <code>/var/spool/torque/server_priv/nodes</code>:
<source lang="text">
hcl01
hcl02
hcl03
hcl04
hcl05
hcl06
hcl07
hcl08
hcl09
hcl10
hcl11
hcl12
hcl13
hcl14
hcl15
hcl16
</source>
Now return to the extracted source distribution, find the <code>contrib/init.d</code> folder and copy <code>debian.pbs_server</code> to <code>/etc/init.d/pbs_server</code>. Edit the copied file so that the header reads:
<source lang="bash">
#!/bin/sh
### BEGIN INIT INFO
# Provides:       pbs_server
# Required-Start: $local_fs $named
# Should-Start:
# Required-Stop:
</source>
This ensures that important services are started before the PBS/Torque daemon.

Allow all users to see all queued jobs:
 qmgr -c 'set server query_other_jobs=TRUE'
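Rather than typing the sixteen entries of the nodes file by hand, they can be generated; a sketch (my own addition) assuming the hcl01–hcl16 naming used above:
<source lang="bash">
#!/bin/sh
# Generate the Torque nodes file contents for hcl01..hcl16.
# Writes to stdout; redirect into /var/spool/torque/server_priv/nodes.
gen_nodes() {
    i=1
    while [ "$i" -le 16 ]; do
        printf 'hcl%02d\n' "$i"
        i=$((i + 1))
    done
}

gen_nodes
</source>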
Edit the Torque config file <code>/var/spool/torque/torque.cfg</code> adding the line: SERVERHOST=192.168.20.254 Add the compute nodes of the cluster to the file <code>/var/spool/torque/server_priv/nodes</code> <source lang="text"> hcl01 hcl02 hcl03 hcl04 hcl05 hcl06 hcl07 hcl08 hcl09 hcl10 hcl11 hcl12 hcl13 hcl14 hcl15 hcl16 </source> Now return to the extracted source distribution, find the <code>contrib/init.d</code> folder and copy the <code>debian.pbs_server</code> to <code>/etc/init.d/pbs_server</code>. Edit the inplace file so that the header reads: <source lang="bash"> #!/bin/sh ### BEGIN INIT INFO # Provides: pbs_server # Required-Start: $local_fs $named # Should-Start: # Required-Stop: </source> This will ensure that important services are started before the PBS/Torque daemon. ===Queues=== We will configure 4 queues, with varying priority and preemption policies. *The first is the <b>normal</b> queue. This will be where most jobs, which should not be interrupted, will be placed. It has the highest priority. *The second queue is <b>lowpri</b>, it is for running jobs which may run for extended periods but are interruptible. Should a job <i>A</i> be placed on the normal queue while a lowpri job <i>B</i> is running, <i>B</i> will be sent a kill signal, so it may shut down cleanly, and then it will be requeued on the system, so it can resume running after <i>A</i>b has finished. *The third queue is for running of service jobs like the homedir backup. It is named <b>service</b>. This will have lower priority than the above queues and jobs running on this queue will be preemptable. *The final queue is the <b>volunteer</b> queue. This has the lowest priority and is for executing volunteer computing jobs during otherwise idle periods. These jobs will also be preemptable. ====Queue Setup==== Enter the queue manager program <code>qmgr</code> as the root user on heterogeneous. 
Send it the following commands: Allow all users to see all queued jobs: set server query_other_jobs=TRUE' Create the default <b>normal</b> queue create queue normal set queue normal queue_type = Execution set queue normal Priority = 10000 set queue normal enabled = True set queue normal started = True Create <b>lowpri</b> queue create queue lowpri set queue lowpri queue_type = Execution set queue lowpri Priority = 1000 set queue lowpri enabled = True set queue lowpri started = True Create <b>service</b> queue set queue service queue_type = Execution set queue service Priority = 100 set queue service enabled = True set queue service started = True Create <b>volunteer</b> queue set queue volunteer queue_type = Execution set queue volunteer Priority = 0 set queue volunteer enabled = True set queue volunteer started = True Set some server settings set server default_queue=normal set server resources_default.nodes = 1 set server resources_default.walltime = 12:00:00 Now edit the maui configuration for these queues: <source> QOSWEIGHT 1 SYSCFG QLIST=normal,lowpri,service,volunteer QOSCFG[normal] PRIORITY=10000 QFLAGS=PREEMPTEE QOSCFG[lowpri] PRIORITY=1000 QFLAGS=PREEMPTEE QOSCFG[service] PRIORITY=100 QFLAGS=PREEMPTOR QOSCFG[volunteer] PRIORTIY=0 QFLAGS=PREEMPTEE 1d803731d7d9148da9fe6507c603367ecd4f0494 371 370 2010-05-05T18:05:10Z Rhiggins 4 /* Queue Setup */ wikitext text/x-wiki * Basic installation of Debian Squeeze ==Networking== ===Interfaces=== * edit <code>/etc/networks/interfaces</code> Note that at some point eth1 should be configured by DHCP, it is on the UCD LAN and must be registered correctly (update MAC address with services). <code>eth0</code> is the internal network. 
<source lang="text">
# The loopback network interface
auto lo eth0 eth1
iface lo inet loopback

# The primary network interface
allow-hotplug eth0
iface eth0 inet static
    address 192.168.21.254
    netmask 255.255.255.0
    gateway 192.168.21.1

iface eth1 inet static
    address 193.1.132.124
    netmask 255.255.252.0
    gateway 193.1.132.1
</source>
* Install non-free Linux firmware for the network interface (eth0). This will allow Gigabit operation on eth0 with the tg3 hardware (I think). Edit <code>/etc/apt/sources.list</code>, including the lines:
<source lang="text">
deb http://ftp.ie.debian.org/debian/ squeeze main contrib non-free
deb-src http://ftp.ie.debian.org/debian/ squeeze main contrib non-free
</source>
* Install firmware-linux: <source lang="text">apt-get update && apt-get install firmware-linux</source>
You probably need to reboot now.

===DNS / BIND===
We will run our own DNS server for the cluster. First set resolv.conf:
<source lang="text">
nameserver 127.0.0.1
nameserver 137.43.116.19
nameserver 137.43.116.17
nameserver 137.43.105.22
domain ucd.ie
search ucd.ie
</source>
Now install bind9 (<code>apt-get install bind9</code>). Edit <code>/etc/bind/named.conf.local</code> and set the domain zones for the cluster (forward and reverse).
We have two subnets for which reverse lookups will have to be specified, 192.168.20 and 192.168.21:
<source lang="text">
//
// Do any local configuration here
//
// Consider adding the 1918 zones here, if they are not used in your
// organization
//include "/etc/bind/zones.rfc1918";

include "/etc/bind/rndc.key";

controls {
    inet 127.0.0.1 allow { localhost; } keys { "rndc-key"; };
};

zone "heterogeneous.ucd.ie" {
    type master;
    file "db.heterogeneous.ucd.ie";
};

zone "21.168.192.in-addr.arpa" {
    type master;
    file "db.192.168.21";
};

zone "20.168.192.in-addr.arpa" {
    type master;
    file "db.192.168.20";
};
</source>
Also edit the options file <code>/etc/bind/named.conf.options</code>; note the subnet we define in the allow sections, 192.168.20.0/23, which permits access from 192.168.20.* and 192.168.21.* addresses.
<source lang="text">
options {
    directory "/var/cache/bind";

    // If there is a firewall between you and nameservers you want
    // to talk to, you may need to fix the firewall to allow multiple
    // ports to talk. See http://www.kb.cert.org/vuls/id/800113

    // If your ISP provided one or more IP addresses for stable
    // nameservers, you probably want to use them as forwarders.
    // Uncomment the following block, and insert the addresses replacing
    // the all-0's placeholder.
    forwarders {
        137.43.116.19;
        137.43.116.17;
        137.43.105.22;
    };

    recursion yes;
    version "REFUSED";
    allow-recursion { 127.0.0.1; 192.168.20.0/23; };
    allow-query { 127.0.0.1; 192.168.20.0/23; };
    auth-nxdomain no; # conform to RFC1035
    listen-on-v6 { any; };
};
</source>
Now work on the zone file specified, <code>db.heterogeneous.ucd.ie</code>, and the reverse maps <code>db.192.168.21</code> & <code>db.192.168.20</code>. Populate them with all nodes of the cluster.

===IP Tables===
* Set up <code>iptables</code>. We want to implement NAT between the internal network (<code>eth0</code>) and the external one (<code>eth1</code>). Add a script named <code>00iptables</code> to the <code>/etc/network/if-up.d</code> directory.
All scripts in this directory will be executed after network interfaces are brought up, so this will persist:
<source lang="bash">
#!/bin/sh
PATH=/usr/sbin:/sbin:/bin:/usr/bin

IF_INT=eth0
IF_EXT=eth1

#
# delete all existing rules.
#
iptables -F
iptables -t nat -F
iptables -t mangle -F
iptables -X

# Always accept loopback traffic
iptables -A INPUT -i lo -j ACCEPT

# Allow established connections, and those not coming from the outside
iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
iptables -A INPUT -m state --state NEW ! -i $IF_EXT -j ACCEPT
iptables -A FORWARD -i $IF_EXT -o $IF_INT -m state --state ESTABLISHED,RELATED -j ACCEPT

# Allow outgoing connections from the LAN side.
iptables -A FORWARD -i $IF_INT -o $IF_EXT -j ACCEPT

# Masquerade.
iptables -t nat -A POSTROUTING -o $IF_EXT -j MASQUERADE

# Don't forward from the outside to the inside.
iptables -A FORWARD -i $IF_EXT -o $IF_INT -j REJECT

# Enable routing.
echo 1 > /proc/sys/net/ipv4/ip_forward
</source>

==Clonezilla==
Firstly, Clonezilla is probably going to pollute a lot of your server configuration when it sets itself up. Be prepared to lose your iptables configuration, NFS (if any) and DHCP settings. Maybe more.
* follow the guide to installing Clonezilla [http://www.howtoforge.com/cloning-linux-systems-with-clonezilla-server-edition-clonezilla-se-p2 here]. Essentially:
** add the repository key: <source lang="text">wget -q http://drbl.sourceforge.net/GPG-KEY-DRBL -O- | apt-key add -</source>
** add the line to /etc/apt/sources.list: <source lang="text">deb http://drbl.sourceforge.net/drbl-core drbl stable</source>
** run: <source lang="text">apt-get update && apt-get install drbl && /opt/drbl/sbin/drbl4imp</source>
** accept the default options for drbl4imp.
* After Clonezilla has installed, edit <code>/etc/dhcp3/dhcpd.conf</code>, adding all entries for test nodes <code>hcl07</code> and <code>hcl03</code>.
Also ensure these nodes have been removed from the existing heterogeneous.ucd.ie DHCP configuration, so that they are served by only one machine.
<source lang="text">
default-lease-time 300;
max-lease-time 300;
option subnet-mask 255.255.255.0;
option domain-name-servers 137.43.116.19,137.43.116.17,137.43.105.22;
option domain-name "ucd.ie";
ddns-update-style none;
# brett had ad-hoc ...?
server-name drbl;
filename = "pxelinux.0";
subnet 192.168.21.0 netmask 255.255.255.0 {
    option subnet-mask 255.255.255.0;
    option routers 192.168.21.1;
    next-server 192.168.21.254;
    pool {
        # allow members of "DRBL-Client";
        range 192.168.21.200 192.168.21.212;
    }
    host hcl03 {
        option host-name "hcl03.ucd.ie";
        hardware ethernet 00:14:22:0A:22:6C;
        fixed-address 192.168.21.5;
    }
    host hcl03_eth1 {
        option host-name "hcl03_eth1.ucd.ie";
        hardware ethernet 00:14:22:0A:22:6D;
        fixed-address 192.168.21.105;
    }
    host hcl07 {
        option host-name "hcl07.ucd.ie";
        hardware ethernet 00:14:22:0A:20:E2;
        fixed-address 192.168.21.9;
    }
    host hcl07_eth1 {
        option host-name "hcl07_eth1.ucd.ie";
        hardware ethernet 00:14:22:0A:20:E3;
        fixed-address 192.168.21.109;
    }
    default-lease-time 21600;
    max-lease-time 43200;
}
</source>

==NTP Daemon==
Nodes in the cluster should use the Network Time Protocol to set their clocks; heterogeneous.ucd.ie should provide them that service. Install ntpd with the following command:
 apt-get install ntp
Configure the daemon with the following line in <code>/etc/ntp.conf</code>:
 restrict 192.168.20.0 mask 255.255.254.0 nomodify notrap

==Install DHCP==
Install the DHCP server package with <code>apt-get install dhcp3-server</code>. When you install Clonezilla it will probably pollute your DHCP server setup, so make ... TODO

==Install NIS==
Copy users from <code>passwd</code>, <code>group</code> and <code>shadow</code> from <code>/etc</code> on <code>hcl01</code>. Install nis.
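Copying the account databases over from <code>hcl01</code> can be scripted. Below is a minimal, hypothetical sketch that keeps only regular-user entries, assuming (consistent with the MINUID=500 convention used for NIS on this cluster) that regular accounts start at UID 500; the file names are placeholders, not paths used by the real procedure:

```shell
# Hypothetical sketch: keep only regular-user entries (UID >= 500) from a
# passwd file copied off hcl01. File names below are placeholders.
cat > /tmp/passwd.hcl01 <<'EOF'
root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/bin/sh
alice:x:1001:1001:Alice:/home/alice:/bin/bash
nobody:x:65534:65534:nobody:/nonexistent:/bin/sh
EOF

# Field 3 of passwd is the numeric UID; drop system accounts and the
# "nobody" sentinel entry, keeping only real user accounts.
awk -F: '$3 >= 500 && $3 < 65534' /tmp/passwd.hcl01 > /tmp/passwd.users
```

The same filter on the numeric ID field applies when merging <code>group</code> entries.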
Edit <code>/etc/defaultdomain</code> so that it contains:
 heterogeneous.ucd.ie
Edit <code>/etc/default/nis</code> so that it contains:
 # Are we a NIS server and if so what kind (values: false, slave, master)
 NISSERVER=master
Edit <code>/etc/ypserv.securenets</code> so that it contains:
 # allow connections from local
 255.0.0.0       127.0.0.0
 # allow connections from heterogeneous subnets .20 and .21
 255.255.254.0   192.168.20.0
The NIS host is also a client of itself, so do the client setup as follows. Edit <code>/etc/hosts</code> and ensure the NIS master is listed:
 192.168.21.254 heterogeneous.ucd.ie heterogeneous
Edit <code>/etc/yp.conf</code> and ensure that it contains:
 domain heterogeneous.ucd.ie server localhost
Edit <code>/etc/passwd</code>, adding a line to the end that reads <code>+::::::</code>. Edit <code>/etc/group</code>, adding a line <code>+:::</code> at the end. The NIS Makefile will not pull user IDs and group IDs that are lower than a certain value; we must set this to 500 in <code>/var/yp/Makefile</code>:
 MINUID=500
 MINGID=500
Start the <code>ypbind</code> and <code>yppasswd</code> services. Then initialise the NIS database:
 /usr/lib/yp/ypinit -m
Accept the defaults at the prompts. Now start the other NIS services:
 service nis start

==Installing Ganglia Frontend==
Install the packages gmetad and ganglia-webfrontend. Configure the front end by appending the following to <code>/etc/apache2/apache2.conf</code>:
 Include /etc/ganglia-webfrontend/apache.conf
Configure gmetad by adding the following line to <code>/etc/ganglia/gmetad.conf</code>:
 data_source "HCL Cluster" 192.168.20.1 192.168.20.16
This means that the gmetad collector connects to hcl01 and hcl16 on the .20 subnet to gather data for the frontend to use. After all packages are configured, execute:
<source lang="text">
service apache2 restart
service gmetad restart
</source>
Pointing your browser [http://heterogeneous.ucd.ie/ganglia/index.php here] should display the monitoring page for the HCL Cluster.
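Several of the files above (the BIND allow lists, the NTP restrict line and <code>ypserv.securenets</code>) use the 255.255.254.0 (/23) mask so that a single entry covers both the .20 and .21 subnets. A quick standalone sanity check of that arithmetic (not part of any cluster config):

```shell
# A /23 netmask is 255.255.254.0: it clears the low bit of the third octet,
# so 192.168.20.x and 192.168.21.x both reduce to the 192.168.20.0 network.
mask=254
net20=$((20 & mask))
net21=$((21 & mask))
echo "third octet 20 -> $net20, 21 -> $net21"
```

Both masked values come out as 20, confirming that one /23 rule serves the whole cluster address range.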
<code>gmond</code> must also be installed and configured on the cluster nodes.

=Hardware Monitoring & Backup=
==Disk Monitoring==
Install smartmontools as per [http://www.howtoforge.com/checking-hard-disk-sanity-with-smartmontools-debian-ubuntu here]. Briefly:
 apt-get install smartmontools
Edit <code>/etc/default/smartmontools</code> so that it contains:
<source lang="text">
# List of devices you want to explicitly enable S.M.A.R.T. for
# Not needed (and not recommended) if the device is monitored by smartd
enable_smart="/dev/sda"

# uncomment to start smartd on system startup
start_smartd=yes

# uncomment to pass additional options to smartd on startup
smartd_opts="--interval=1800"
</source>
Open <code>/etc/smartd.conf</code> and edit the first line that begins with DEVICESCAN (all lines after the first instance of DEVICESCAN are ignored). Have it read something like:
 DEVICESCAN -d removable -n standby -m root -m robert_higgins@iol.ie -M exec /usr/share/smartmontools/smartd-runner
Then start the service: <code>/etc/init.d/smartmontools start</code>

Note: consider installing this on all nodes, as it would be useful to have prior notice of any failing disks.

= Torque - PBS =
Download Torque from [http://www.clusterresources.com/pages/products/torque-resource-manager.php here]. Extract the archive and configure:
 ./configure --prefix=/usr/local
Run <code>make && make install</code>. Torque will have installed files under <code>/usr/local</code> and <code>/var/spool/torque</code>.
Edit the Torque config file <code>/var/spool/torque/torque.cfg</code>, adding the line:
 SERVERHOST=192.168.20.254
Add the compute nodes of the cluster to the file <code>/var/spool/torque/server_priv/nodes</code>:
<source lang="text">
hcl01
hcl02
hcl03
hcl04
hcl05
hcl06
hcl07
hcl08
hcl09
hcl10
hcl11
hcl12
hcl13
hcl14
hcl15
hcl16
</source>
Now return to the extracted source distribution, find the <code>contrib/init.d</code> folder and copy <code>debian.pbs_server</code> to <code>/etc/init.d/pbs_server</code>. Edit the copied file so that the header reads:
<source lang="bash">
#!/bin/sh
### BEGIN INIT INFO
# Provides:       pbs_server
# Required-Start: $local_fs $named
# Should-Start:
# Required-Stop:
</source>
This will ensure that important services are started before the PBS/Torque daemon.

===Queues===
We will configure four queues, with varying priority and preemption policies.
*The first is the <b>normal</b> queue. This is where most jobs, which should not be interrupted, will be placed. It has the highest priority.
*The second queue is <b>lowpri</b>; it is for running jobs which may run for extended periods but are interruptible. Should a job <i>A</i> be placed on the normal queue while a lowpri job <i>B</i> is running, <i>B</i> will be sent a kill signal, so it may shut down cleanly, and then it will be requeued on the system, so it can resume running after <i>A</i> has finished.
*The third queue is for running service jobs like the homedir backup. It is named <b>service</b>. This has lower priority than the above queues, and jobs running on this queue will be preemptable.
*The final queue is the <b>volunteer</b> queue. This has the lowest priority and is for executing volunteer computing jobs during otherwise idle periods. These jobs will also be preemptable.

====Queue Setup====
Enter the queue manager program <code>qmgr</code> as the root user on heterogeneous.
Send it the following commands.

Allow all users to see all queued jobs:
 set server query_other_jobs=TRUE
Create the default <b>normal</b> queue:
 create queue normal
 set queue normal queue_type = Execution
 set queue normal Priority = 10000
 set queue normal enabled = True
 set queue normal started = True
Create the <b>lowpri</b> queue:
 create queue lowpri
 set queue lowpri queue_type = Execution
 set queue lowpri Priority = 1000
 set queue lowpri enabled = True
 set queue lowpri started = True
Create the <b>service</b> queue:
 create queue service
 set queue service queue_type = Execution
 set queue service Priority = 100
 set queue service enabled = True
 set queue service started = True
Create the <b>volunteer</b> queue:
 create queue volunteer
 set queue volunteer queue_type = Execution
 set queue volunteer Priority = 0
 set queue volunteer enabled = True
 set queue volunteer started = True
Set some server settings:
 set server default_queue=normal
 set server resources_default.nodes = 1
 set server resources_default.walltime = 12:00:00
Now edit the maui configuration for these queues:
<source lang="text">
SERVERHOST            heterogeneous
ADMIN1                root
RMPOLLINTERVAL        00:00:01
DEFERTIME             0
DEFERCOUNT            86400
PREEMPTIONPOLICY      REQUEUE
QUEUETIMEWEIGHT       1
QOSWEIGHT             1
SYSCFG                QLIST=normal,lowpri,service,volunteer
QOSCFG[normal]        QFLAGS=PREEMPTOR
QOSCFG[lowpri]        QFLAGS=PREEMPTEE,PREEMPTOR
QOSCFG[service]       QFLAGS=PREEMPTEE,PREEMPTOR
QOSCFG[volunteer]     QFLAGS=PREEMPTEE
CREDWEIGHT            1
CLASSWEIGHT           1
CLASSCFG[normal]      QDEF=normal PRIORITY=10000
CLASSCFG[lowpri]      QDEF=lowpri PRIORITY=1000
CLASSCFG[service]     QDEF=service PRIORITY=100
CLASSCFG[volunteer]   QDEF=volunteer PRIORITY=0
</source>

=HCL cluster=
http://hcl.ucd.ie/Hardware

==General Information==
[[Image:network.jpg|right|thumbnail|Layout of the Cluster]]
The HCL cluster is heterogeneous in both computing hardware and network capability.
Nodes are from Dell, IBM, and HP, with Celeron, Pentium 4, Xeon, and AMD processors ranging in speed from 1.8 to 3.6 GHz. Accordingly, architectures and parameters such as front side bus, cache, and main memory all vary. The operating system used is Debian "squeeze" with Linux kernel 2.6.32. The network hardware consists of two Cisco 24+4 port Gigabit switches. Each node has two Gigabit ethernet ports: each eth0 is connected to the first switch, and each eth1 is connected to the second switch. The switches are also connected to each other. The bandwidth of each port can be configured to any value between 8Kb/s and 1Gb/s. This allows testing on a very large number of network topologies. As the bandwidth on the link connecting the two switches can also be configured, the cluster can act as two separate clusters connected via one link. The diagram shows a schematic of the cluster.

== Detailed Cluster Specification ==
A table of hardware configuration is available here: [[Cluster Specification]]

== Software packages available on HCL Cluster 2.0 ==
With a fresh installation of the operating system on the HCL Cluster, the following packages are available:
* autoconf
* automake
* fftw2
* git
* gfortran
* gnuplot
* libtool
* netperf
* octave3.2
* qhull
* subversion
* valgrind
* gsl-dev
* vim
* python
* mc
* openmpi-bin
* openmpi-dev
* evince
* libboost-graph-dev
* libboost-serialization-dev
* r-cran-strucchange
* graphviz
* doxygen

[[new hcl node install & configuration log]]
[[new heterogeneous.ucd.ie install log]]

== Access and Security ==
All access and security for the cluster is handled by the gateway machine (heterogeneous.ucd.ie). This machine is not considered a compute node and should not be used as such. The only new incoming connections allowed are ssh; other incoming packets, such as http, that are responding to requests from inside the cluster (established or related) are also allowed.
Incoming ssh packets are only accepted if they originate from designated IP addresses. These must be registered UCD IPs. csserver.ucd.ie is allowed, as is hclgate.ucd.ie, on which all users have accounts. Thus, to gain access to the cluster, you can ssh from csserver, hclgate or other allowed machines to heterogeneous. From there you can ssh to any of the nodes (hcl01-hcl16). Access from outside the UCD network is only possible once you have gained entry to a server that allows outside connections (such as csserver.ucd.ie).

== Cluster Administration ==
===Useful Tools===
<code>root</code> on <code>heterogeneous.ucd.ie</code> has a number of [http://expect.nist.gov/ Expect] scripts (in <code>/root/scripts</code>) to automate administration on the cluster. <code>root_login.expect</code> will automatically log into a host, provide the root password and either return a shell to the user or execute a command that is passed as a second argument. Command syntax is as follows:
<source lang="text">
# ./root_login.expect
usage: root_login.expect <host> [command]
</source>
Example usage, to log in and execute a command on each node in the cluster (the file <code>/etc/dsh/machines.list</code> contains the hostnames of all compute nodes of the cluster):
 # for i in `cat /etc/dsh/machines.list`; do ./root_login.expect $i ps ax \| grep pbs; done
Command syntax is as follows: <source lang="text"> # root_ssh usage: root_ssh <host> [command] </source> Example usage, to login and execute a command on each node in the cluster (note the file <code>/etc/dsh/machines.list</code> contains the hostnames of all compute nodes of the cluster): # for i in `cat /etc/dsh/machines.list`; do root_ssh $i ps ax \| grep pbs; done The above is sequential. To run parallel jobs, for example: <code>apt-get update && apt-get -y upgrade</code>, try the following trick with [http://www.gnu.org/software/screen/ screen]: # for i in `cat /etc/dsh/machines.list`; do screen -L -d -m root_ssh $i apt-get update \&\& apt-get -y upgrade'; done You can check the screenlog.* files for errors and delete them when you are happy == Software packages available on HCL Cluster 2.0 == Wit a fresh installation of operating systems on HCL Cluster the follow list of packages are avalible: * autoconf * automake * fftw2 * git * gfortran * gnuplot * libtool * netperf * octave3.2 * qhull * subversion * valgrind * gsl-dev * vim * python * mc * openmpi-bin * openmpi-dev * evince * libboost-graph-dev * libboost-serialization-dev * r-cran-strucchange * graphviz * doxygen [[new hcl node install & configuration log]] [[new heterogeneous.ucd.ie install log]] == Access and Security == All access and security for the cluster is handled by the gateway machine (heterogeneous.ucd.ie). This machine is not considered a compute node and should not be used as such. The only new incoming connections allowed are ssh, other incoming packets such as http that are responding to requests from inside the cluster (established or related) are also allowed. Incoming ssh packets are only accepted if they are originating from designated IP addresses. These IP's must be registered ucd IP's. csserver.ucd.ie is allowed, as is hclgate.ucd.ie, on which all users have accounts. Thus to gain access to the cluster you can ssh from csserver, hclgate or other allowed machines to heterogeneous. 
From there you can ssh to any of the nodes (hcl01-hcl16). Access from outside the UCD network is only allowed once you have gained entry to a server that allows outside connections (such as csserver.ucd.ie) 567100c2feaa88ed4c1c7fdc59ee729248a04d8a 367 352 2010-05-03T11:23:19Z Rhiggins 4 /* Useful Tools */ wikitext text/x-wiki http://hcl.ucd.ie/Hardware ==General Information== [[Image:network.jpg|right|thumbnail||Layout of the Cluster]] The hcl cluster is heterogeneous in computing hardware & network ability. Nodes are from Dell, IBM, and HP, with Celeron, Pentium 4, Xeon, and AMD processors ranging in speeds from 1.8 to 3.6Ghz. Accordingly architectures and parameters such as Front Side Bus, Cache, and Main Memory all vary. Operating System used is Debian “squeeze” with Linux kernel 2.6.32. The network hardware consists of two Cisco 24+4 port Gigabit switches. Each node has two Gigabit ethernet ports - each eth0 is connected to the first switch, and each eth1 is connected to the second switch. The switches are also connected to each other. The bandwidth of each port can be configured to meet any value between 8Kb/s and 1Gb/s. This allows testing on a very large number of network topologies, As the bandwidth on the link connecting the two switches can also be configured, the cluster can actually act as two separate clusters connected via one link. The diagram below shows a schematic of the cluster. == Detailed Cluster Specification == A table of hardware configuration is available here: [[Cluster Specification]] == Cluster Administration == ===Useful Tools=== <code>root</code> on <code>heterogeneous.ucd.ie</code> has a number of [http://expect.nist.gov/ Expect] scripts to automate administration on the cluster (in <code>/root/scripts</code>. <code>root_ssh</code> is will automatically log into a host, provide the root password and either return a shell to the user or execute a command that is passed as a second argument. 
Command syntax is as follows: <source lang="text"> # root_ssh usage: root_ssh <host> [command] </source> Example usage, to login and execute a command on each node in the cluster (note the file <code>/etc/dsh/machines.list</code> contains the hostnames of all compute nodes of the cluster): # for i in `cat /etc/dsh/machines.list`; do root_ssh $i ps ax \| grep pbs; done The above is sequential. To run parallel jobs, for example: <code>apt-get update && apt-get -y upgrade</code>, try the following trick with [http://www.gnu.org/software/screen/ screen]: # for i in `cat /etc/dsh/machines.list`; do screen -L -d -m root_ssh $i apt-get update \&\& apt-get -y upgrade'; done You can check the screenlog.* files for errors and delete them when you are happy. Sometimes all logs are sent to screenlog.0, not sure why. == Software packages available on HCL Cluster 2.0 == Wit a fresh installation of operating systems on HCL Cluster the follow list of packages are avalible: * autoconf * automake * fftw2 * git * gfortran * gnuplot * libtool * netperf * octave3.2 * qhull * subversion * valgrind * gsl-dev * vim * python * mc * openmpi-bin * openmpi-dev * evince * libboost-graph-dev * libboost-serialization-dev * r-cran-strucchange * graphviz * doxygen [[new hcl node install & configuration log]] [[new heterogeneous.ucd.ie install log]] == Access and Security == All access and security for the cluster is handled by the gateway machine (heterogeneous.ucd.ie). This machine is not considered a compute node and should not be used as such. The only new incoming connections allowed are ssh, other incoming packets such as http that are responding to requests from inside the cluster (established or related) are also allowed. Incoming ssh packets are only accepted if they are originating from designated IP addresses. These IP's must be registered ucd IP's. csserver.ucd.ie is allowed, as is hclgate.ucd.ie, on which all users have accounts. 
Thus to gain access to the cluster you can ssh from csserver, hclgate or other allowed machines to heterogeneous. From there you can ssh to any of the nodes (hcl01-hcl16). Access from outside the UCD network is only allowed once you have gained entry to a server that allows outside connections (such as csserver.ucd.ie) 28dd9b401b889c7bd92032b445c49be9f61e9026 368 367 2010-05-03T11:23:58Z Rhiggins 4 /* Useful Tools */ wikitext text/x-wiki http://hcl.ucd.ie/Hardware ==General Information== [[Image:network.jpg|right|thumbnail||Layout of the Cluster]] The hcl cluster is heterogeneous in computing hardware & network ability. Nodes are from Dell, IBM, and HP, with Celeron, Pentium 4, Xeon, and AMD processors ranging in speeds from 1.8 to 3.6Ghz. Accordingly architectures and parameters such as Front Side Bus, Cache, and Main Memory all vary. Operating System used is Debian “squeeze” with Linux kernel 2.6.32. The network hardware consists of two Cisco 24+4 port Gigabit switches. Each node has two Gigabit ethernet ports - each eth0 is connected to the first switch, and each eth1 is connected to the second switch. The switches are also connected to each other. The bandwidth of each port can be configured to meet any value between 8Kb/s and 1Gb/s. This allows testing on a very large number of network topologies, As the bandwidth on the link connecting the two switches can also be configured, the cluster can actually act as two separate clusters connected via one link. The diagram below shows a schematic of the cluster. == Detailed Cluster Specification == A table of hardware configuration is available here: [[Cluster Specification]] == Cluster Administration == ===Useful Tools=== <code>root</code> on <code>heterogeneous.ucd.ie</code> has a number of [http://expect.nist.gov/ Expect] scripts to automate administration on the cluster (in <code>/root/scripts</code>). 
<code>root_ssh</code> will automatically log into a host, provide the root password and either return a shell to the user or execute a command that is passed as a second argument. Command syntax is as follows: <source lang="text"> # root_ssh usage: root_ssh [user@]<host> [command] </source> Example usage, to login and execute a command on each node in the cluster (note the file <code>/etc/dsh/machines.list</code> contains the hostnames of all compute nodes of the cluster): # for i in `cat /etc/dsh/machines.list`; do root_ssh $i ps ax \| grep pbs; done The above is sequential. To run parallel jobs, for example: <code>apt-get update && apt-get -y upgrade</code>, try the following trick with [http://www.gnu.org/software/screen/ screen]: # for i in `cat /etc/dsh/machines.list`; do screen -L -d -m root_ssh $i apt-get update \&\& apt-get -y upgrade'; done You can check the screenlog.* files for errors and delete them when you are happy. Sometimes all logs are sent to screenlog.0, not sure why. == Software packages available on HCL Cluster 2.0 == Wit a fresh installation of operating systems on HCL Cluster the follow list of packages are avalible: * autoconf * automake * fftw2 * git * gfortran * gnuplot * libtool * netperf * octave3.2 * qhull * subversion * valgrind * gsl-dev * vim * python * mc * openmpi-bin * openmpi-dev * evince * libboost-graph-dev * libboost-serialization-dev * r-cran-strucchange * graphviz * doxygen [[new hcl node install & configuration log]] [[new heterogeneous.ucd.ie install log]] == Access and Security == All access and security for the cluster is handled by the gateway machine (heterogeneous.ucd.ie). This machine is not considered a compute node and should not be used as such. The only new incoming connections allowed are ssh, other incoming packets such as http that are responding to requests from inside the cluster (established or related) are also allowed. 
Incoming ssh packets are only accepted if they are originating from designated IP addresses. These IP's must be registered ucd IP's. csserver.ucd.ie is allowed, as is hclgate.ucd.ie, on which all users have accounts. Thus to gain access to the cluster you can ssh from csserver, hclgate or other allowed machines to heterogeneous. From there you can ssh to any of the nodes (hcl01-hcl16). Access from outside the UCD network is only allowed once you have gained entry to a server that allows outside connections (such as csserver.ucd.ie) 2cf49f345f4c377caca71ec17b7ba5b92a5085bd 377 368 2010-05-11T14:05:33Z Kiril 3 /* Access and Security */ wikitext text/x-wiki http://hcl.ucd.ie/Hardware ==General Information== [[Image:network.jpg|right|thumbnail||Layout of the Cluster]] The hcl cluster is heterogeneous in computing hardware & network ability. Nodes are from Dell, IBM, and HP, with Celeron, Pentium 4, Xeon, and AMD processors ranging in speeds from 1.8 to 3.6Ghz. Accordingly architectures and parameters such as Front Side Bus, Cache, and Main Memory all vary. Operating System used is Debian “squeeze” with Linux kernel 2.6.32. The network hardware consists of two Cisco 24+4 port Gigabit switches. Each node has two Gigabit ethernet ports - each eth0 is connected to the first switch, and each eth1 is connected to the second switch. The switches are also connected to each other. The bandwidth of each port can be configured to meet any value between 8Kb/s and 1Gb/s. This allows testing on a very large number of network topologies, As the bandwidth on the link connecting the two switches can also be configured, the cluster can actually act as two separate clusters connected via one link. The diagram below shows a schematic of the cluster. 
== Detailed Cluster Specification == A table of hardware configuration is available here: [[Cluster Specification]] == Cluster Administration == ===Useful Tools=== <code>root</code> on <code>heterogeneous.ucd.ie</code> has a number of [http://expect.nist.gov/ Expect] scripts to automate administration on the cluster (in <code>/root/scripts</code>). <code>root_ssh</code> will automatically log into a host, provide the root password and either return a shell to the user or execute a command that is passed as a second argument. Command syntax is as follows: <source lang="text"> # root_ssh usage: root_ssh [user@]<host> [command] </source> Example usage, to login and execute a command on each node in the cluster (note the file <code>/etc/dsh/machines.list</code> contains the hostnames of all compute nodes of the cluster): # for i in `cat /etc/dsh/machines.list`; do root_ssh $i ps ax \| grep pbs; done The above is sequential. To run parallel jobs, for example: <code>apt-get update && apt-get -y upgrade</code>, try the following trick with [http://www.gnu.org/software/screen/ screen]: # for i in `cat /etc/dsh/machines.list`; do screen -L -d -m root_ssh $i apt-get update \&\& apt-get -y upgrade'; done You can check the screenlog.* files for errors and delete them when you are happy. Sometimes all logs are sent to screenlog.0, not sure why. == Software packages available on HCL Cluster 2.0 == Wit a fresh installation of operating systems on HCL Cluster the follow list of packages are avalible: * autoconf * automake * fftw2 * git * gfortran * gnuplot * libtool * netperf * octave3.2 * qhull * subversion * valgrind * gsl-dev * vim * python * mc * openmpi-bin * openmpi-dev * evince * libboost-graph-dev * libboost-serialization-dev * r-cran-strucchange * graphviz * doxygen [[new hcl node install & configuration log]] [[new heterogeneous.ucd.ie install log]] == Access and Security == All access and security for the cluster is handled by the gateway machine (heterogeneous.ucd.ie). 
This machine is not considered a compute node and should not be used as such. The only new incoming connections allowed are ssh, other incoming packets such as http that are responding to requests from inside the cluster (established or related) are also allowed. Incoming ssh packets are only accepted if they are originating from designated IP addresses. These IP's must be registered ucd IP's. csserver.ucd.ie is allowed, as is hclgate.ucd.ie, on which all users have accounts. Thus to gain access to the cluster you can ssh from csserver, hclgate or other allowed machines to heterogeneous. From there you can ssh to any of the nodes (hcl01-hcl16). Access from outside the UCD network is only allowed once you have gained entry to a server that allows outside connections (such as csserver.ucd.ie) == Some networking issues on HCL cluster (unsolved) == 2f20b9c4f882ede5922f6aefc8cd30437f68cfc9 378 377 2010-05-11T14:06:40Z Kiril 3 /* Some networking issues on HCL cluster (unsolved) */ wikitext text/x-wiki http://hcl.ucd.ie/Hardware ==General Information== [[Image:network.jpg|right|thumbnail||Layout of the Cluster]] The hcl cluster is heterogeneous in computing hardware & network ability. Nodes are from Dell, IBM, and HP, with Celeron, Pentium 4, Xeon, and AMD processors ranging in speeds from 1.8 to 3.6Ghz. Accordingly architectures and parameters such as Front Side Bus, Cache, and Main Memory all vary. Operating System used is Debian “squeeze” with Linux kernel 2.6.32. The network hardware consists of two Cisco 24+4 port Gigabit switches. Each node has two Gigabit ethernet ports - each eth0 is connected to the first switch, and each eth1 is connected to the second switch. The switches are also connected to each other. The bandwidth of each port can be configured to meet any value between 8Kb/s and 1Gb/s. 
This allows testing on a very large number of network topologies, As the bandwidth on the link connecting the two switches can also be configured, the cluster can actually act as two separate clusters connected via one link. The diagram below shows a schematic of the cluster. == Detailed Cluster Specification == A table of hardware configuration is available here: [[Cluster Specification]] == Cluster Administration == ===Useful Tools=== <code>root</code> on <code>heterogeneous.ucd.ie</code> has a number of [http://expect.nist.gov/ Expect] scripts to automate administration on the cluster (in <code>/root/scripts</code>). <code>root_ssh</code> will automatically log into a host, provide the root password and either return a shell to the user or execute a command that is passed as a second argument. Command syntax is as follows: <source lang="text"> # root_ssh usage: root_ssh [user@]<host> [command] </source> Example usage, to login and execute a command on each node in the cluster (note the file <code>/etc/dsh/machines.list</code> contains the hostnames of all compute nodes of the cluster): # for i in `cat /etc/dsh/machines.list`; do root_ssh $i ps ax \| grep pbs; done The above is sequential. To run parallel jobs, for example: <code>apt-get update && apt-get -y upgrade</code>, try the following trick with [http://www.gnu.org/software/screen/ screen]: # for i in `cat /etc/dsh/machines.list`; do screen -L -d -m root_ssh $i apt-get update \&\& apt-get -y upgrade'; done You can check the screenlog.* files for errors and delete them when you are happy. Sometimes all logs are sent to screenlog.0, not sure why. 
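The parallel screen trick can be sketched as a small script. This is a dry-run illustration, not the cluster's actual tooling: <code>LAUNCH</code> defaults to <code>echo</code> so the generated commands are printed rather than executed, and <code>root_ssh</code> and the node names are assumed from the description above. On the gateway you would set <code>LAUNCH="screen -L -d -m"</code>.

```shell
#!/bin/sh
# Dry-run sketch of the parallel-upgrade trick (not the cluster's actual tooling).
# LAUNCH defaults to "echo" so commands are printed instead of run;
# on the gateway you would use: LAUNCH="screen -L -d -m"
NODE_LIST=$(mktemp)
printf 'hcl01\nhcl02\nhcl03\n' > "$NODE_LIST"   # stand-in for /etc/dsh/machines.list
LAUNCH="${LAUNCH:-echo}"
while read -r node; do
    if [ -z "$node" ]; then continue; fi        # skip blank lines
    # single-quote the remote command so && runs on the node, not on the gateway
    $LAUNCH root_ssh "$node" 'apt-get update && apt-get -y upgrade'
done < "$NODE_LIST"
rm -f "$NODE_LIST"
```

One detached screen session is started per node, so the upgrades proceed concurrently; <code>-L</code> gives each session its own screenlog.* file to inspect afterwards.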
== Software packages available on HCL Cluster 2.0 == Wit a fresh installation of operating systems on HCL Cluster the follow list of packages are avalible: * autoconf * automake * fftw2 * git * gfortran * gnuplot * libtool * netperf * octave3.2 * qhull * subversion * valgrind * gsl-dev * vim * python * mc * openmpi-bin * openmpi-dev * evince * libboost-graph-dev * libboost-serialization-dev * r-cran-strucchange * graphviz * doxygen [[new hcl node install & configuration log]] [[new heterogeneous.ucd.ie install log]] == Access and Security == All access and security for the cluster is handled by the gateway machine (heterogeneous.ucd.ie). This machine is not considered a compute node and should not be used as such. The only new incoming connections allowed are ssh, other incoming packets such as http that are responding to requests from inside the cluster (established or related) are also allowed. Incoming ssh packets are only accepted if they are originating from designated IP addresses. These IP's must be registered ucd IP's. csserver.ucd.ie is allowed, as is hclgate.ucd.ie, on which all users have accounts. Thus to gain access to the cluster you can ssh from csserver, hclgate or other allowed machines to heterogeneous. From there you can ssh to any of the nodes (hcl01-hcl16). 
Access from outside the UCD network is only allowed once you have gained entry to a server that allows outside connections (such as csserver.ucd.ie) == Some networking issues on HCL cluster (unsolved) == If "/sbin/route" gives: Kernel IP routing table Destination Gateway Genmask Flags Metric Ref Use Iface 239.2.11.72 * 255.255.255.255 UH 0 0 0 eth0 heterogeneous.u * 255.255.255.255 UH 0 0 0 eth0 192.168.20.0 * 255.255.255.0 U 0 0 0 eth0 192.168.20.0 * 255.255.254.0 U 0 0 0 eth0 192.168.20.0 * 255.255.254.0 U 0 0 0 eth1 default heterogeneous.u 0.0.0.0 UG 0 0 0 eth0 dff91cb742a2951f7ff5441cb44086067f9fc915 379 378 2010-05-11T14:09:08Z Kiril 3 /* Some networking issues on HCL cluster (unsolved) */ wikitext text/x-wiki http://hcl.ucd.ie/Hardware ==General Information== [[Image:network.jpg|right|thumbnail||Layout of the Cluster]] The hcl cluster is heterogeneous in computing hardware & network ability. Nodes are from Dell, IBM, and HP, with Celeron, Pentium 4, Xeon, and AMD processors ranging in speeds from 1.8 to 3.6Ghz. Accordingly architectures and parameters such as Front Side Bus, Cache, and Main Memory all vary. Operating System used is Debian “squeeze” with Linux kernel 2.6.32. The network hardware consists of two Cisco 24+4 port Gigabit switches. Each node has two Gigabit ethernet ports - each eth0 is connected to the first switch, and each eth1 is connected to the second switch. The switches are also connected to each other. The bandwidth of each port can be configured to meet any value between 8Kb/s and 1Gb/s. This allows testing on a very large number of network topologies, As the bandwidth on the link connecting the two switches can also be configured, the cluster can actually act as two separate clusters connected via one link. The diagram below shows a schematic of the cluster. 
== Detailed Cluster Specification == A table of hardware configuration is available here: [[Cluster Specification]] == Cluster Administration == ===Useful Tools=== <code>root</code> on <code>heterogeneous.ucd.ie</code> has a number of [http://expect.nist.gov/ Expect] scripts to automate administration on the cluster (in <code>/root/scripts</code>). <code>root_ssh</code> will automatically log into a host, provide the root password and either return a shell to the user or execute a command that is passed as a second argument. Command syntax is as follows: <source lang="text"> # root_ssh usage: root_ssh [user@]<host> [command] </source> Example usage, to login and execute a command on each node in the cluster (note the file <code>/etc/dsh/machines.list</code> contains the hostnames of all compute nodes of the cluster): # for i in `cat /etc/dsh/machines.list`; do root_ssh $i ps ax \| grep pbs; done The above is sequential. To run parallel jobs, for example: <code>apt-get update && apt-get -y upgrade</code>, try the following trick with [http://www.gnu.org/software/screen/ screen]: # for i in `cat /etc/dsh/machines.list`; do screen -L -d -m root_ssh $i apt-get update \&\& apt-get -y upgrade'; done You can check the screenlog.* files for errors and delete them when you are happy. Sometimes all logs are sent to screenlog.0, not sure why. == Software packages available on HCL Cluster 2.0 == Wit a fresh installation of operating systems on HCL Cluster the follow list of packages are avalible: * autoconf * automake * fftw2 * git * gfortran * gnuplot * libtool * netperf * octave3.2 * qhull * subversion * valgrind * gsl-dev * vim * python * mc * openmpi-bin * openmpi-dev * evince * libboost-graph-dev * libboost-serialization-dev * r-cran-strucchange * graphviz * doxygen [[new hcl node install & configuration log]] [[new heterogeneous.ucd.ie install log]] == Access and Security == All access and security for the cluster is handled by the gateway machine (heterogeneous.ucd.ie). 
This machine is not considered a compute node and should not be used as such. The only new incoming connections allowed are ssh, other incoming packets such as http that are responding to requests from inside the cluster (established or related) are also allowed. Incoming ssh packets are only accepted if they are originating from designated IP addresses. These IP's must be registered ucd IP's. csserver.ucd.ie is allowed, as is hclgate.ucd.ie, on which all users have accounts. Thus to gain access to the cluster you can ssh from csserver, hclgate or other allowed machines to heterogeneous. From there you can ssh to any of the nodes (hcl01-hcl16). Access from outside the UCD network is only allowed once you have gained entry to a server that allows outside connections (such as csserver.ucd.ie) == Some networking issues on HCL cluster (unsolved) == "/sbin/route" should give: Kernel IP routing table Destination Gateway Genmask Flags Metric Ref Use Iface 239.2.11.72 * 255.255.255.255 UH 0 0 0 eth0 heterogeneous.u * 255.255.255.255 UH 0 0 0 eth0 192.168.21.0 * 255.255.255.0 U 0 0 0 eth1 192.168.20.0 * 255.255.255.0 U 0 0 0 eth0 192.168.20.0 * 255.255.254.0 U 0 0 0 eth0 192.168.20.0 * 255.255.254.0 U 0 0 0 eth1 default heterogeneous.u 0.0.0.0 UG 0 0 0 eth0 f8d3ca6fcf65c0ac9eaa9961f1281b41a511a67a 380 379 2010-05-11T14:12:02Z Kiril 3 /* Some networking issues on HCL cluster (unsolved) */ wikitext text/x-wiki http://hcl.ucd.ie/Hardware ==General Information== [[Image:network.jpg|right|thumbnail||Layout of the Cluster]] The hcl cluster is heterogeneous in computing hardware & network ability. Nodes are from Dell, IBM, and HP, with Celeron, Pentium 4, Xeon, and AMD processors ranging in speeds from 1.8 to 3.6Ghz. Accordingly architectures and parameters such as Front Side Bus, Cache, and Main Memory all vary. Operating System used is Debian “squeeze” with Linux kernel 2.6.32. The network hardware consists of two Cisco 24+4 port Gigabit switches. 
Each node has two Gigabit ethernet ports - each eth0 is connected to the first switch, and each eth1 is connected to the second switch. The switches are also connected to each other. The bandwidth of each port can be configured to meet any value between 8Kb/s and 1Gb/s. This allows testing on a very large number of network topologies, As the bandwidth on the link connecting the two switches can also be configured, the cluster can actually act as two separate clusters connected via one link. The diagram below shows a schematic of the cluster. == Detailed Cluster Specification == A table of hardware configuration is available here: [[Cluster Specification]] == Cluster Administration == ===Useful Tools=== <code>root</code> on <code>heterogeneous.ucd.ie</code> has a number of [http://expect.nist.gov/ Expect] scripts to automate administration on the cluster (in <code>/root/scripts</code>). <code>root_ssh</code> will automatically log into a host, provide the root password and either return a shell to the user or execute a command that is passed as a second argument. Command syntax is as follows: <source lang="text"> # root_ssh usage: root_ssh [user@]<host> [command] </source> Example usage, to login and execute a command on each node in the cluster (note the file <code>/etc/dsh/machines.list</code> contains the hostnames of all compute nodes of the cluster): # for i in `cat /etc/dsh/machines.list`; do root_ssh $i ps ax \| grep pbs; done The above is sequential. To run parallel jobs, for example: <code>apt-get update && apt-get -y upgrade</code>, try the following trick with [http://www.gnu.org/software/screen/ screen]: # for i in `cat /etc/dsh/machines.list`; do screen -L -d -m root_ssh $i apt-get update \&\& apt-get -y upgrade'; done You can check the screenlog.* files for errors and delete them when you are happy. Sometimes all logs are sent to screenlog.0, not sure why. 
== Software packages available on HCL Cluster 2.0 == Wit a fresh installation of operating systems on HCL Cluster the follow list of packages are avalible: * autoconf * automake * fftw2 * git * gfortran * gnuplot * libtool * netperf * octave3.2 * qhull * subversion * valgrind * gsl-dev * vim * python * mc * openmpi-bin * openmpi-dev * evince * libboost-graph-dev * libboost-serialization-dev * r-cran-strucchange * graphviz * doxygen [[new hcl node install & configuration log]] [[new heterogeneous.ucd.ie install log]] == Access and Security == All access and security for the cluster is handled by the gateway machine (heterogeneous.ucd.ie). This machine is not considered a compute node and should not be used as such. The only new incoming connections allowed are ssh, other incoming packets such as http that are responding to requests from inside the cluster (established or related) are also allowed. Incoming ssh packets are only accepted if they are originating from designated IP addresses. These IP's must be registered ucd IP's. csserver.ucd.ie is allowed, as is hclgate.ucd.ie, on which all users have accounts. Thus to gain access to the cluster you can ssh from csserver, hclgate or other allowed machines to heterogeneous. From there you can ssh to any of the nodes (hcl01-hcl16). 
Access from outside the UCD network is only allowed once you have gained entry to a server that allows outside connections (such as csserver.ucd.ie) == Some networking issues on HCL cluster (unsolved) == "/sbin/route" should give: Kernel IP routing table Destination Gateway Genmask Flags Metric Ref Use Iface 239.2.11.72 * 255.255.255.255 UH 0 0 0 eth0 heterogeneous.u * 255.255.255.255 UH 0 0 0 eth0 192.168.21.0 * 255.255.255.0 U 0 0 0 eth1 192.168.20.0 * 255.255.255.0 U 0 0 0 eth0 192.168.20.0 * 255.255.254.0 U 0 0 0 eth0 192.168.20.0 * 255.255.254.0 U 0 0 0 eth1 default heterogeneous.u 0.0.0.0 UG 0 0 0 eth0 For reasons unclear, sometimes most machines miss the entry: 192.168.21.0 * 255.255.255.0 U 0 0 0 eth1 For Open MPI, this leads to inability to do a system sockets "connect" call to any 192.*.21.* address (hangup). For this case, you can switch off eth1: mpirun --mca btl_tcp_if_exclude lo,eth1 ... b7ce1979195e0c509913a4f5b9da3076dd69d2ce 381 380 2010-05-11T14:14:31Z Kiril 3 /* Some networking issues on HCL cluster (unsolved) */ wikitext text/x-wiki http://hcl.ucd.ie/Hardware ==General Information== [[Image:network.jpg|right|thumbnail||Layout of the Cluster]] The hcl cluster is heterogeneous in computing hardware & network ability. Nodes are from Dell, IBM, and HP, with Celeron, Pentium 4, Xeon, and AMD processors ranging in speeds from 1.8 to 3.6Ghz. Accordingly architectures and parameters such as Front Side Bus, Cache, and Main Memory all vary. Operating System used is Debian “squeeze” with Linux kernel 2.6.32. The network hardware consists of two Cisco 24+4 port Gigabit switches. Each node has two Gigabit ethernet ports - each eth0 is connected to the first switch, and each eth1 is connected to the second switch. The switches are also connected to each other. The bandwidth of each port can be configured to meet any value between 8Kb/s and 1Gb/s. 
This allows testing on a very large number of network topologies. As the bandwidth of the link connecting the two switches can also be configured, the cluster can act as two separate clusters connected via one link. The diagram shows a schematic of the cluster.

== Detailed Cluster Specification ==

A table of the hardware configuration is available here: [[Cluster Specification]]

== Cluster Administration ==

===Useful Tools===

<code>root</code> on <code>heterogeneous.ucd.ie</code> has a number of [http://expect.nist.gov/ Expect] scripts to automate administration of the cluster (in <code>/root/scripts</code>). <code>root_ssh</code> will automatically log into a host, provide the root password, and either return a shell to the user or execute a command passed as a second argument. Command syntax is as follows:

<source lang="text">
# root_ssh
usage: root_ssh [user@]<host> [command]
</source>

Example usage, to log in and execute a command on each node in the cluster (note the file <code>/etc/dsh/machines.list</code> contains the hostnames of all compute nodes of the cluster):

 # for i in `cat /etc/dsh/machines.list`; do root_ssh $i ps ax \| grep pbs; done

The above runs sequentially. To run jobs in parallel, for example <code>apt-get update && apt-get -y upgrade</code>, try the following trick with [http://www.gnu.org/software/screen/ screen]:

 # for i in `cat /etc/dsh/machines.list`; do screen -L -d -m root_ssh $i apt-get update \&\& apt-get -y upgrade; done

You can check the screenlog.* files for errors and delete them when you are happy. Sometimes all logs are sent to screenlog.0; the reason is unclear.
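The screen trick above can also be written as a small POSIX shell helper that backgrounds one job per host and collects a separate log per host. A sketch: <code>for_each_host</code> and the <code>RUNNER</code> override are hypothetical names, and it assumes passwordless ssh or a wrapper such as the <code>root_ssh</code> script above:

```shell
# Run a command on every host in a machines file, in parallel, with one
# log file per host. Sketch only: for_each_host and RUNNER are hypothetical;
# assumes passwordless ssh or a wrapper such as root_ssh.
for_each_host() {
    list=$1; shift
    while read -r host; do
        # one background job and one log file per host
        "${RUNNER:-ssh}" "$host" "$@" > "log.$host" 2>&1 &
    done < "$list"
    wait    # block until every host has finished
}
```

For example: <code>RUNNER=root_ssh; for_each_host /etc/dsh/machines.list apt-get -y upgrade</code>, then inspect the per-host <code>log.*</code> files instead of a shared screenlog.0.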
== Some networking issues on HCL cluster (unsolved) ==

<code>/sbin/route</code> should give:

<source lang="text">
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
239.2.11.72     *               255.255.255.255 UH    0      0        0 eth0
heterogeneous.u *               255.255.255.255 UH    0      0        0 eth0
192.168.21.0    *               255.255.255.0   U     0      0        0 eth1
192.168.20.0    *               255.255.255.0   U     0      0        0 eth0
192.168.20.0    *               255.255.254.0   U     0      0        0 eth0
192.168.20.0    *               255.255.254.0   U     0      0        0 eth1
default         heterogeneous.u 0.0.0.0         UG    0      0        0 eth0
</source>

For reasons that are unclear, many machines sometimes miss the entry:

<source lang="text">
192.168.21.0    *               255.255.255.0   U     0      0        0 eth1
</source>

For Open MPI, this makes any system sockets <code>connect</code> call to a 192.168.21.* address hang. In this case, you can either

* switch off eth1:

 mpirun --mca btl_tcp_if_exclude lo,eth1 ...

* or restore the above table on all nodes by running <code>sh /etc/network/if-up.d/00routes</code> as root.

It is not yet clear why, without this entry, connections to the 192.168.21.* addresses fail.
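The kernel chooses among the remaining entries by longest-prefix match: of all routes whose network/netmask cover the destination, the one with the longest (most specific) mask wins. The following sketch illustrates this selection (hypothetical helper names, not the kernel's actual code):

```shell
# Sketch of longest-prefix route matching; ip2int and pick_route are
# hypothetical helpers, not the kernel implementation.
ip2int() {
    old_ifs=$IFS; IFS=.
    set -- $1
    IFS=$old_ifs
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# pick_route DEST "NET MASK IFACE"... -> prints the interface of the best match
pick_route() {
    dest=$(ip2int "$1"); shift
    best_mask=-1; best_if=none
    for route in "$@"; do
        set -- $route
        net=$(ip2int "$1"); mask=$(ip2int "$2"); iface=$3
        # a route matches if the masked destination equals the network;
        # a numerically larger (longer) mask is more specific
        if [ $(( dest & mask )) -eq "$net" ] && [ "$mask" -gt "$best_mask" ]; then
            best_mask=$mask; best_if=$iface
        fi
    done
    echo "$best_if"
}
```

With the full table, 192.168.21.5 selects the 255.255.255.0 route on eth1; with that entry missing, the 255.255.254.0 route on eth0 is the best remaining match, which is why the packets would be expected to leave via eth0.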
We expect that in this case the following rule should be matched instead (because of the mask):

<source lang="text">
192.168.20.0    *               255.255.254.0   U     0      0        0 eth0
</source>

The packets would then leave over the eth0 interface and should travel via switch1 to switch2 and reach the eth1 interface of the corresponding node.

Hardware documentation:

* [[:File:Cisco3560Specs.pdf]]: Cisco Catalyst 3560 Specifications
* [[:File:Cisco3560Guide.pdf]]: Cisco Catalyst 3560 User Guide
* [[:File:X306.pdf]]: IBM x-Series 306 Documentation
* [[:File:E326.pdf]]: IBM e-Series 326 Documentation
* [[:File:Proliant100SeriesGuide.pdf]]: HP Proliant DL-140 G2 Documentation
* [[:File:ProliantDL320G3Guide.pdf]]: HP Proliant DL-320 G3 Documentation
* [[:File:PE750.tgz]]: Dell Poweredge 750 Documentation
* [[:File:SC1425.tgz]]: Dell Poweredge SC1425 Documentation
* [[:File:Network.jpg]]: Network layout

= HCL cluster/hcl node install configuration log =
HCL nodes will be installed from a clone of a root node, <code>hcl07</code>. The general installation of the root node is documented here. There are a number of complications that result from the cloning process; solutions to these complications are also explained.

=General Installation=

Partition the filesystem with swap at the end of the disk, of size 1GB, equal to the maximum memory installed on any cluster node.
The root file system occupies the remainder of the disk, in EXT4 format. Install the long list of packages.

==Networking==

Configure the network interfaces as follows:

<source lang="text">
# This file describes the network interfaces available on your system
# and how to activate them. For more information, see interfaces(5).

# The loopback network interface
auto lo eth0 eth1
iface lo inet loopback

# The primary network interface
allow-hotplug eth0
iface eth0 inet dhcp
allow-hotplug eth1
iface eth1 inet dhcp
</source>

===Routing Tables===

Add an executable script named <code>00routes</code> to the directory <code>/etc/network/if-up.d</code>. This script is called after the interfaces listed as <code>auto</code> are brought up on boot (or on a networking service restart). Note: the DHCP process for eth1 is backgrounded so that the startup of other services can continue. Our routing script adds routes on this interface even though it is not yet fully up; the script outputs some errors, but the routing entries remain nonetheless. The script should read as follows:

<source lang="bash">
#!/bin/sh
# Static Routes

# route ganglia broadcast to eth0
route add -host 239.2.11.72 dev eth0
# all traffic to the heterogeneous gateway goes through eth0
route add -host 192.168.20.254 dev eth0
# all subnet traffic goes through the specific interface
route add -net 192.168.20.0 netmask 255.255.255.0 dev eth0
route add -net 192.168.21.0 netmask 255.255.255.0 dev eth1
</source>

The naming of the script is important: we want our routes in place before the other scripts in the <code>/etc/network/if-up.d</code> directory are executed, and they are executed in alphabetical order.

===Hosts===

Change the hosts file so that it does not list the node's hostname; otherwise this would confuse nodes cloned from this image.
<source lang="text">
127.0.0.1       localhost

# The following lines are desirable for IPv6 capable hosts
::1     localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
</source>

==Ganglia==

Install the ganglia-monitor package. Configure the ganglia monitor by editing <code>/etc/ganglia/gmond.conf</code> so that it contains:

<source lang="text">
cluster {
  name = "HCL Cluster"
  owner = "University College Dublin"
  latlong = "unspecified"
  url = "http://hcl.ucd.ie/"
}
</source>

And:

<source lang="text">
/* Feel free to specify as many udp_send_channels as you like.
   Gmond used to only support having a single channel */
udp_send_channel {
  mcast_join = 239.2.11.72
  port = 8649
  ttl = 1
}

/* You can specify as many udp_recv_channels as you like as well. */
udp_recv_channel {
  mcast_join = 239.2.11.72
  port = 8649
  bind = 239.2.11.72
}
</source>

After all packages are installed, execute:

<source lang="text">
service ganglia-monitor restart
</source>

==NIS Client==

Install the nis package. Set <code>/etc/defaultdomain</code> to contain <code>heterogeneous.ucd.ie</code>. Make sure the NIS server has an entry in <code>/etc/hosts</code>; DNS may not be active when the NIS client starts, and we want to ensure that it connects to the server successfully.
 192.168.21.254 heterogeneous.ucd.ie heterogeneous

Make sure the file <code>/etc/nsswitch.conf</code> contains:

 passwd:         compat
 group:          compat
 shadow:         compat

Append to <code>/etc/passwd</code> the line <code>+::::::</code>, to <code>/etc/group</code> the line <code>+:::</code>, and to <code>/etc/shadow</code> the line <code>+::::::::</code>.

Start the nis service:

 service nis start

Check that NIS is operating correctly by running:

 ypcat passwd

==NFS==

Add the line

 192.168.21.3:/home /home nfs soft,retrans=6 0 0

to <code>/etc/fstab</code>.

==NTP==

Install the NTP software:

 apt-get install ntp

Edit the configuration so that <code>server heterogeneous.ucd.ie</code> is the sole server entry; comment out any other servers. Restart the NTP service.

=Complications=

==Hostnames==

Debian does not pull the hostname from the DHCP server, so without intervention cloned nodes will keep the hostname stored in the root node's image. A bug report describing the setting of a hostname via DHCP is available [https://bugs.launchpad.net/ubuntu/+source/dhcp3/+bug/90388 here]. The solution we use is to add the file <code>/etc/dhcp3/dhclient-exit-hooks.d/hostname</code> with the contents:

<source lang="bash">
if [ -n "$new_host_name" ]; then
    echo "$new_host_name" > /etc/hostname
    /bin/hostname "$new_host_name"
fi
</source>

The effect of this is to set the hostname of the machine after an interface is configured by dhclient (the DHCP client). Note that the hostname of the machine will be set by the last interface configured via DHCP; in the current configuration that is <code>eth0</code>. If an interface is reconfigured using dhclient, the hostname will be reset to the name belonging to that interface. Further, the current hostnames for the second interface (<code>eth1</code>) on the nodes are '''invalid'''.
They follow the format hcl??_eth1.ucd.ie, however the '_' character is not permitted in hostnames and attempting to set such a hostname fails. ==udev and Network Interfaces== The udev system attempts to keep network interface names consistent regardless of changing hardware. This may be useful for laptops with wirless cards that a plugged in and out, but it causes problems when trying to install our root node image across all machines in the cluster. A description of the problem can be read [http://www.ducea.com/2008/09/01/remove-debian-udev-persistent-net-rules/ here]. The solution is to remove the udev rules for persistent network interfaces, and disable the generator script for these rules. On the root cloning node do the following #remove the file <code>/etc/udev/rules.d/70-persistent-net.rules</code> #and to the top of the file: <code>/lib/udev/rules.d/75-persistent-net-generator.rules</code>, the following lines: <source lang="text"># skip generation of persistent network interfaces ACTION=="*", GOTO="persistent_net_generator_end"</source> 72e2e91c4f82f9ed1e9c60f637568297774d8e21 366 365 2010-05-02T16:57:47Z Rhiggins 4 /* Routing Tables */ wikitext text/x-wiki HCL Nodes will be installed from a clone of a root node, <code>hcl07</code>. The general installation of the root is documented here. There are a number of complications as a result of the cloning process. Solutions to these complications are also explained. =General Installation= Partition filesystem with swap at the end of the disk, size 1GB, equal to maximum of the installed memory on cluster nodes. Root file system occupies the remainder of the disk, EXT4 format. Install long list of packages. ==Networking== Configure network interface as follows: <source lang="text"># This file describes the network interfaces available on your system # and how to activate them. For more information, see interfaces(5). 
# The loopback network interface auto lo eth0 eth1 iface lo inet loopback # The primary network interface allow-hotplug eth0 iface eth0 inet dhcp allow-hotplug eth1 iface eth1 inet dhcp </source> ===Routing Tables=== Add an executable script named <code>00routes</code> to the directory <code>/etc/network/if-up.d</code>. This script will be called after the interfaces listed as <code>auto</code> are brought up on boot (or networking service restart). Note: the DHCP process for eth1 is backgrounded so that the startup of other services can continue. Our routing script adds routes on this interface even though it is not yet fully up. The script outputs some errors but the routing entries remain none-the-less. The script should read as follows: <source lang="bash"> #!/bin/sh # Static Routes # route ganglia broadcast t route add -host 239.2.11.72 dev eth0 # all traffic to heterogeneous gate goes through eth0 route add -host 192.168.20.254 dev eth0 # all subnet traffic goes through specific interface route add -net 192.168.20.0 netmask 255.255.255.0 dev eth0 route add -net 192.168.21.0 netmask 255.255.255.0 dev eth1 </source> The naming of the script is important, we want our routes in place before other scripts in the <code>/etc/network/if-up.d</code> directory are executed, the order in which they are executed is alphabetical. ===Hosts=== Change the hosts file so that it does not list the node's hostname, otherwise this would confuse nodes that are cloned from this image. <source lang="text"> 127.0.0.1 localhost # The following lines are desirable for IPv6 capable hosts ::1 localhost ip6-localhost ip6-loopback fe00::0 ip6-localnet ff00::0 ip6-mcastprefix ff02::1 ip6-allnodes ff02::2 ip6-allrouters </source> ==Ganglia== Install the ganglia-monitor package. 
Configure ganglia monitor by editing <code>/etc/ganglia/gmond.conf</code> so that it contains: <source lang="text">cluster { name = "HCL Cluster" owner = "University College Dublin" latlong = "unspecified" url = "http://hcl.ucd.ie/" } </source> And ... <source lang="text"> /* Feel free to specify as many udp_send_channels as you like. Gmond used to only support having a single channel */ udp_send_channel { mcast_join = 239.2.11.72 port = 8649 ttl = 1 } /* You can specify as many udp_recv_channels as you like as well. */ udp_recv_channel { mcast_join = 239.2.11.72 port = 8649 bind = 239.2.11.72 } </source> After all packages are complete execute: <source lang="text"> service ganglia-monitor restart </source> ==NIS Client== Install nis package. Set <code>/etc/defaultdomain</code> to contain <code>heterogeneous.ucd.ie</code> Make sure the NIS Server has an entry in <code>/etc/hosts</code>, this is because DNS may not be active when the NIS client is starting and we want to ensure that it connects to the server successful. 192.168.21.254 heterogeneous.ucd.ie heterogeneous Make sure the file <code>/etc/nsswitch.conf</code> contains: passwd: compat group: compat shadow: compat Append to the file: <code>/etc/passwd</code> the line <code>+::::::</code> Append to the file: <code>/etc/group</code> the line <code>+:::</code> Append to the file: <code>/etc/shadow</code> the line <code>+::::::::</code> Start the nis service: service nis start Check that nis is operating correctly by running the following command: ypcat passwd ==NFS== Add the line 192.168.21.3:/home /home nfs soft,retrans=6 0 0 to <code>/etc/fstab</code> ==NTP== Install NTP software: apt-get install ntp Edit configuration and make <code>server heterogeneous.ucd.ie</code> the sole server entry. Comment out any other servers. Restart the NTP service. =Complications= ==Hostnames== Debian does not pull the hostname from the DHCP Server. 
Without intervention cloned nodes will keep the hostname stored on the image of the root node. A bug describing the setting of a hostname via DHCP is described [https://bugs.launchpad.net/ubuntu/+source/dhcp3/+bug/90388 here] The solution we will use is to add the file <code>/etc/dhcp3/dhclient-exit-hooks.d/hostname</code> with the contents: <source lang="bash"> if [[ -n $new_host_name ]]; then echo "$new_host_name" > /etc/hostname /bin/hostname $new_host_name fi </source> The effect of this is to set the hostname of the machine after an interface is configured using dhclient (DHCP Client). Note, the hostname of the machine will be set by the last interface that is configured via DHCP, in the current configuration that will be <code>eth0</code>. If an interface is reconfigured using dhclient, the hostname will be reset to the name belonging to that interface. Further, the current hostnames for the second interface on nodes <code>eth1</code> are '''invalid'''. They follow the format hcl??_eth1.ucd.ie, however the '_' character is not permitted in hostnames and attempting to set such a hostname fails. ==udev and Network Interfaces== The udev system attempts to keep network interface names consistent regardless of changing hardware. This may be useful for laptops with wirless cards that a plugged in and out, but it causes problems when trying to install our root node image across all machines in the cluster. A description of the problem can be read [http://www.ducea.com/2008/09/01/remove-debian-udev-persistent-net-rules/ here]. The solution is to remove the udev rules for persistent network interfaces, and disable the generator script for these rules. 
On the root cloning node do the following #remove the file <code>/etc/udev/rules.d/70-persistent-net.rules</code> #and to the top of the file: <code>/lib/udev/rules.d/75-persistent-net-generator.rules</code>, the following lines: <source lang="text"># skip generation of persistent network interfaces ACTION=="*", GOTO="persistent_net_generator_end"</source> 24e0dd798d96e634a060e7d830b19dd05d14d44f 376 366 2010-05-11T14:04:02Z Rhiggins 4 /* Complications */ wikitext text/x-wiki HCL Nodes will be installed from a clone of a root node, <code>hcl07</code>. The general installation of the root is documented here. There are a number of complications as a result of the cloning process. Solutions to these complications are also explained. =General Installation= Partition filesystem with swap at the end of the disk, size 1GB, equal to maximum of the installed memory on cluster nodes. Root file system occupies the remainder of the disk, EXT4 format. Install long list of packages. ==Networking== Configure network interface as follows: <source lang="text"># This file describes the network interfaces available on your system # and how to activate them. For more information, see interfaces(5). # The loopback network interface auto lo eth0 eth1 iface lo inet loopback # The primary network interface allow-hotplug eth0 iface eth0 inet dhcp allow-hotplug eth1 iface eth1 inet dhcp </source> ===Routing Tables=== Add an executable script named <code>00routes</code> to the directory <code>/etc/network/if-up.d</code>. This script will be called after the interfaces listed as <code>auto</code> are brought up on boot (or networking service restart). Note: the DHCP process for eth1 is backgrounded so that the startup of other services can continue. Our routing script adds routes on this interface even though it is not yet fully up. The script outputs some errors but the routing entries remain none-the-less. 
The script should read as follows:

<source lang="bash">
#!/bin/sh
# Static Routes

# route ganglia broadcast traffic
route add -host 239.2.11.72 dev eth0

# all traffic to the heterogeneous gateway goes through eth0
route add -host 192.168.20.254 dev eth0

# all subnet traffic goes through the specific interface
route add -net 192.168.20.0 netmask 255.255.255.0 dev eth0
route add -net 192.168.21.0 netmask 255.255.255.0 dev eth1
</source>

The naming of the script is important: we want our routes in place before the other scripts in the <code>/etc/network/if-up.d</code> directory are executed, and they are executed in alphabetical order.

===Hosts===

Change the hosts file so that it does not list the node's hostname; otherwise this would confuse nodes cloned from this image.

<source lang="text">
127.0.0.1       localhost

# The following lines are desirable for IPv6 capable hosts
::1     localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
</source>

==Ganglia==

Install the ganglia-monitor package. Configure the ganglia monitor by editing <code>/etc/ganglia/gmond.conf</code> so that it contains:

<source lang="text">
cluster {
  name = "HCL Cluster"
  owner = "University College Dublin"
  latlong = "unspecified"
  url = "http://hcl.ucd.ie/"
}
</source>

and:

<source lang="text">
/* Feel free to specify as many udp_send_channels as you like.
   Gmond used to only support having a single channel */
udp_send_channel {
  mcast_join = 239.2.11.72
  port = 8649
  ttl = 1
}

/* You can specify as many udp_recv_channels as you like as well. */
udp_recv_channel {
  mcast_join = 239.2.11.72
  port = 8649
  bind = 239.2.11.72
}
</source>

After the package is installed and configured, execute:

<source lang="text">
service ganglia-monitor restart
</source>

==NIS Client==

Install the nis package.
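On Debian this is a single package install (a sketch; the installer prompts for the NIS domain, which is also set via <code>/etc/defaultdomain</code> below):

<source lang="bash">
# install the NIS client tools; the debconf prompt for the NIS
# domain can be answered with heterogeneous.ucd.ie
apt-get install nis
</source>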
Set <code>/etc/defaultdomain</code> to contain <code>heterogeneous.ucd.ie</code>.

Make sure the NIS server has an entry in <code>/etc/hosts</code>: DNS may not be active when the NIS client starts, and we want to ensure that it connects to the server successfully.

 192.168.21.254 heterogeneous.ucd.ie heterogeneous

Make sure the file <code>/etc/nsswitch.conf</code> contains:

 passwd: compat
 group:  compat
 shadow: compat

Append to <code>/etc/passwd</code> the line <code>+::::::</code>, to <code>/etc/group</code> the line <code>+:::</code>, and to <code>/etc/shadow</code> the line <code>+::::::::</code>.

Start the NIS service:

 service nis start

Check that NIS is operating correctly by running:

 ypcat passwd

==NFS==

Add the line

 192.168.21.3:/home /home nfs soft,retrans=6 0 0

to <code>/etc/fstab</code>.

==NTP==

Install the NTP software:

 apt-get install ntp

Edit the configuration so that <code>server heterogeneous.ucd.ie</code> is the sole server entry; comment out any other servers. Restart the NTP service.

=Complications=

==Hostnames==

Debian does not pull the hostname from the DHCP server. Without intervention, cloned nodes will keep the hostname stored on the image of the root node. A bug describing the setting of a hostname via DHCP is described [https://bugs.launchpad.net/ubuntu/+source/dhcp3/+bug/90388 here].

The solution we will use is to add the file <code>/etc/dhcp3/dhclient-exit-hooks.d/hostname</code> with the contents:

<source lang="bash">
if [[ -n $new_host_name ]]; then
	echo "$new_host_name" > /etc/hostname
	/bin/hostname $new_host_name
fi
</source>

The effect of this is to set the hostname of the machine after an interface is configured using dhclient (the DHCP client). Note: the hostname of the machine will be set by the last interface configured via DHCP; in the current configuration that will be <code>eth0</code>.
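To confirm that the hook fired, you can compare the configured hostname with what the server actually sent; the lease file records this (a sketch; the lease file path is an assumption based on the dhcp3 client shipped with this release):

<source lang="bash">
# if the server supplied a hostname, the lease contains an
# "option host-name" entry (path is an assumption for dhcp3)
grep 'option host-name' /var/lib/dhcp3/dhclient.eth0.leases
</source>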
If an interface is reconfigured using dhclient, the hostname will be reset to the name belonging to that interface. Further, the current hostnames for the second interface on the nodes (<code>eth1</code>) are '''invalid''': they follow the format hcl??_eth1.ucd.ie, but the '_' character is not permitted in hostnames, and attempting to set such a hostname fails.

==udev and Network Interfaces==

The udev system attempts to keep network interface names consistent regardless of changing hardware. This may be useful for laptops with wireless cards that are plugged in and out, but it causes problems when trying to install our root node image across all machines in the cluster. A description of the problem can be read [http://www.ducea.com/2008/09/01/remove-debian-udev-persistent-net-rules/ here].

The solution is to remove the udev rules for persistent network interfaces and to disable the generator script for these rules. On the root cloning node do the following:

# remove the file <code>/etc/udev/rules.d/70-persistent-net.rules</code>
# add to the top of the file <code>/lib/udev/rules.d/75-persistent-net-generator.rules</code> the following lines:

<source lang="text">
# skip generation of persistent network interfaces
ACTION=="*", GOTO="persistent_net_generator_end"
</source>

==Sysstat==

[http://pagesperso-orange.fr/sebastien.godard/ Sysstat] is a useful suite of tools for measuring the performance of different system components. Unfortunately, it adds some unhelpful cron entries for collecting a historic set of system performance data. Though these cron entries point to disabled scripts, we will disable them nonetheless. Edit the file <code>/etc/cron.d/sysstat</code> and comment out all lines.
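The commenting-out can be done in one pass with sed (a sketch; it prefixes <code>#</code> to every line that is not already a comment, leaving blank lines alone):

<source lang="bash">
# comment out every non-comment line of the sysstat crontab in place
sed -i 's/^[^#]/#&/' /etc/cron.d/sysstat
</source>

After editing, the file should read as shown below.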
<source lang="text">
# The first element of the path is a directory where the debian-sa1
# script is located
#PATH=/usr/lib/sysstat:/usr/sbin:/usr/sbin:/usr/bin:/sbin:/bin

# Activity reports every 10 minutes everyday
#5-55/10 * * * * root command -v debian-sa1 > /dev/null && debian-sa1 1 1

# Additional run at 23:59 to rotate the statistics file
#59 23 * * * root command -v debian-sa1 > /dev/null && debian-sa1 60 2
</source>

=SSH=

== Passwordless SSH ==

To set up passwordless SSH, there are three main things to do:
* generate a pair of public/private keys on your local computer
* copy the public key from the source computer to the target computer's authorized_keys file
* check the permissions.

You can repeat this transitively for "A->B->C"; you can use the initial pair of keys everywhere. See here for details: http://www.stearns.org/doc/ssh-techniques.current.html

== Making a cascade of SSH connections easy ==

Here is a very convenient way to set up direct access to any machine instead of doing a cascade of SSH calls. If you cannot directly access e.g. the machine "heterogeneous", but you can log into "csserver" and then into "heterogeneous", you can put this into your .ssh/config file:

 Host csserver
  User kdichev
  Hostname csserver.ucd.ie

 Host heterogeneous
  User kiril
  Hostname heterogeneous.ucd.ie
  ProxyCommand ssh -qax csserver nc %h %p

Since the installation of the new PBS system, you cannot directly log into a hclXX node. You can do <code>ssh heterogeneous</code> instead and use <code>qsub</code>.

== X11 forwarding ==

<source lang="bash">
ssh -X hostname
</source>

=MPI=

== Documentation ==

* http://www.mpi-forum.org/docs/docs.html

== Implementations ==

* [[LAM]]
* [[MPICH]]
* [[OpenMPI]]
* [[MPICH2]]

== Manual installation ==

Install each implementation in a separate subfolder <code>$HOME/SUBDIR</code>, because you may need several MPI implementations (see [[Libraries]]).

== Tips & Tricks ==

* For safe consecutive communications, create a new context, for example:

<source lang="C">
int communication_operation(MPI_Comm comm) {
	MPI_Comm newcomm;
	MPI_Comm_dup(comm, &newcomm);
	... // work with newcomm
	MPI_Comm_free(&newcomm);
}
</source>

Mind the overhead of <code>MPI_Comm_dup</code> and <code>MPI_Comm_free</code>.
* If you are having trouble with the multi-homed nature of the HCL Cluster, check [http://www.open-mpi.org/faq/?category=tcp#tcp-selection here].

== Debugging ==

* Add the following code:

<source lang="C">
int rank;
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
if (!rank) getc(stdin);
MPI_Barrier(MPI_COMM_WORLD);
</source>

* Compile your code with the <code>-g</code> option
* Run the parallel application
* Attach to the process(es) from [[GDB]]
** MPICH-1 runs a background process for each application process: 0, 0b, 1, 1b, ..., therefore attach to the first ones.

=HCL cluster=

http://hcl.ucd.ie/Hardware

==General Information==

[[Image:network.jpg|right|thumbnail||Layout of the Cluster]]

The HCL cluster is heterogeneous in computing hardware and network ability. Nodes are from Dell, IBM, and HP, with Celeron, Pentium 4, Xeon, and AMD processors ranging in speed from 1.8 to 3.6GHz. Accordingly, architectures and parameters such as front side bus, cache, and main memory all vary. The operating system used is Debian “squeeze” with Linux kernel 2.6.32.

The network hardware consists of two Cisco 24+4-port Gigabit switches. Each node has two Gigabit ethernet ports: each eth0 is connected to the first switch, and each eth1 is connected to the second switch. The switches are also connected to each other. The bandwidth of each port can be configured to any value between 8Kb/s and 1Gb/s, which allows testing on a very large number of network topologies. As the bandwidth on the link connecting the two switches can also be configured, the cluster can act as two separate clusters connected via one link. The diagram shows a schematic of the cluster.

== Detailed Cluster Specification ==

A table of the hardware configuration is available here: [[Cluster Specification]]

== Cluster Administration ==

===Useful Tools===

<code>root</code> on <code>heterogeneous.ucd.ie</code> has a number of [http://expect.nist.gov/ Expect] scripts to automate administration on the cluster (in <code>/root/scripts</code>).
<code>root_ssh</code> will automatically log into a host, provide the root password and either return a shell to the user or execute a command that is passed as a second argument. The command syntax is as follows: <source lang="text"> # root_ssh usage: root_ssh [user@]<host> [command] </source> Example usage, to log in and execute a command on each node in the cluster (note that the file <code>/etc/dsh/machines.list</code> contains the hostnames of all compute nodes of the cluster): # for i in `cat /etc/dsh/machines.list`; do root_ssh $i ps ax \| grep pbs; done The above is sequential. To run jobs in parallel, for example <code>apt-get update && apt-get -y upgrade</code>, try the following trick with [http://www.gnu.org/software/screen/ screen]: # for i in `cat /etc/dsh/machines.list`; do screen -L -d -m root_ssh $i apt-get update \&\& apt-get -y upgrade; done You can check the screenlog.* files for errors and delete them when you are happy. Sometimes all logs are sent to screenlog.0; it is not clear why. == Software packages available on HCL Cluster 2.0 == With a fresh installation of the operating system on the HCL Cluster, the following packages are available: * autoconf * automake * fftw2 * git * gfortran * gnuplot * libtool * netperf * octave3.2 * qhull * subversion * valgrind * gsl-dev * vim * python * mc * openmpi-bin * openmpi-dev * evince * libboost-graph-dev * libboost-serialization-dev * r-cran-strucchange * graphviz * doxygen [[new hcl node install & configuration log]] [[new heterogeneous.ucd.ie install log]] == Access and Security == All access and security for the cluster is handled by the gateway machine (heterogeneous.ucd.ie). This machine is not considered a compute node and should not be used as such. The only new incoming connections allowed are ssh; other incoming packets, such as http responses to requests originating from inside the cluster (established or related), are also allowed.
Incoming ssh packets are only accepted if they originate from designated IP addresses. These IPs must be registered UCD IPs. csserver.ucd.ie is allowed, as is hclgate.ucd.ie, on which all users have accounts. Thus, to gain access to the cluster, you can ssh from csserver, hclgate or another allowed machine to heterogeneous. From there you can ssh to any of the nodes (hcl01-hcl16). Access from outside the UCD network is only allowed once you have gained entry to a server that allows outside connections (such as csserver.ucd.ie). == Some networking issues on HCL cluster (unsolved) == "/sbin/route" should give:
 Kernel IP routing table
 Destination     Gateway         Genmask         Flags Metric Ref Use Iface
 239.2.11.72     *               255.255.255.255 UH    0      0   0   eth0
 heterogeneous.u *               255.255.255.255 UH    0      0   0   eth0
 192.168.21.0    *               255.255.255.0   U     0      0   0   eth1
 192.168.20.0    *               255.255.255.0   U     0      0   0   eth0
 192.168.20.0    *               255.255.254.0   U     0      0   0   eth0
 192.168.20.0    *               255.255.254.0   U     0      0   0   eth1
 default         heterogeneous.u 0.0.0.0         UG    0      0   0   eth0
For reasons that are not yet clear, many machines sometimes lack the entry:
 192.168.21.0    *               255.255.255.0   U     0      0   0   eth1
For Open MPI, this means that a system sockets "connect" call to any 192.168.21.* address hangs. In this case, you can either: * exclude eth1: mpirun --mca btl_tcp_if_exclude lo,eth1 ... or * restore the table above on all nodes by running "sh /etc/network/if-up.d/00routes" as root. It is not yet clear why connections to the "21" addresses fail without this entry. We would expect the following rule to match in that case (because of the mask):
 192.168.20.0    *               255.255.254.0   U     0      0   0   eth0
The packets would then leave via the eth0 network interface and travel over switch1 to switch2 and on to the eth1 interface of the corresponding node. * If one attempts a ping from one node A, via its eth0 interface, to the address of another node B's eth1 interface, the following is observed: ** outgoing ping packets appear only on the eth0 interface of node A.
** incoming ping packets appear only on the eth1 interface of node B. ** outgoing ping response packets appear on the eth0 interface of node B, never on the eth1 interface, despite the eth1 address being pinged specifically. What explains this? With the routing tables as they are above, or in the damaged case, the ping may arrive at the correct interface, but the response from B is routed to A-eth0 via B-eth0. In order to get a symmetric behaviour, where a packet leaves A-eth0, travels via the switch bridge to B-eth1 and returns from B-eth1 to A-eth0, one must ensure the routing table of B contains no eth0 entries. Further, after a number of ping packets have been sent in sequence (50 to 100), pings from A, even though the -I eth0 switch is specified, begin to appear on both A-eth0 and A-eth1. This behaviour is unexpected. d7cd016cab3d9ce9540f3d1ac0a2b2e000facde7 388 387 2010-05-11T15:12:22Z Rhiggins 4 /* Some networking issues on HCL cluster (unsolved) */ wikitext text/x-wiki http://hcl.ucd.ie/Hardware ==General Information== [[Image:network.jpg|right|thumbnail||Layout of the Cluster]] The HCL cluster is heterogeneous in both computing hardware and network capability. Nodes are from Dell, IBM, and HP, with Celeron, Pentium 4, Xeon, and AMD processors ranging in speed from 1.8 to 3.6 GHz. Accordingly, architectures and parameters such as front-side bus, cache, and main memory all vary. The operating system used is Debian “squeeze” with Linux kernel 2.6.32. The network hardware consists of two Cisco 24+4 port Gigabit switches. Each node has two Gigabit Ethernet ports: each eth0 is connected to the first switch, and each eth1 is connected to the second switch. The switches are also connected to each other. The bandwidth of each port can be configured to any value between 8 Kb/s and 1 Gb/s.
This allows testing on a very large number of network topologies. As the bandwidth of the link connecting the two switches can also be configured, the cluster can act as two separate clusters connected via one link. The diagram shows a schematic of the cluster. == Detailed Cluster Specification == A table of the hardware configuration is available here: [[Cluster Specification]] == Cluster Administration == ===Useful Tools=== <code>root</code> on <code>heterogeneous.ucd.ie</code> has a number of [http://expect.nist.gov/ Expect] scripts to automate administration on the cluster (in <code>/root/scripts</code>). <code>root_ssh</code> will automatically log into a host, provide the root password and either return a shell to the user or execute a command that is passed as a second argument. The command syntax is as follows: <source lang="text"> # root_ssh usage: root_ssh [user@]<host> [command] </source> Example usage, to log in and execute a command on each node in the cluster (note that the file <code>/etc/dsh/machines.list</code> contains the hostnames of all compute nodes of the cluster): # for i in `cat /etc/dsh/machines.list`; do root_ssh $i ps ax \| grep pbs; done The above is sequential. To run jobs in parallel, for example <code>apt-get update && apt-get -y upgrade</code>, try the following trick with [http://www.gnu.org/software/screen/ screen]: # for i in `cat /etc/dsh/machines.list`; do screen -L -d -m root_ssh $i apt-get update \&\& apt-get -y upgrade; done You can check the screenlog.* files for errors and delete them when you are happy. Sometimes all logs are sent to screenlog.0; it is not clear why.
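Since the screen loop above returns immediately, the screenlog.* files are the only record of what happened on each node. A minimal sketch of a follow-up check, assuming the logs sit in the directory where screen was started; the keyword list is only a guess at what typical failures (e.g. from apt-get) look like:

```shell
#!/bin/sh
# Scan the screenlog.* files left behind by "screen -L" for lines that
# suggest a failed job, and name the logs that deserve a closer look.
check_screen_logs() {
    for log in screenlog.*; do
        [ -f "$log" ] || continue            # no logs in this directory
        if grep -qiE 'error|fail|unable' "$log"; then
            echo "$log: possible problem, please review"
        fi
    done
}

check_screen_logs
```

Logs that are not reported can then be deleted as described above.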
== Software packages available on HCL Cluster 2.0 == With a fresh installation of the operating system on the HCL Cluster, the following packages are available: * autoconf * automake * fftw2 * git * gfortran * gnuplot * libtool * netperf * octave3.2 * qhull * subversion * valgrind * gsl-dev * vim * python * mc * openmpi-bin * openmpi-dev * evince * libboost-graph-dev * libboost-serialization-dev * r-cran-strucchange * graphviz * doxygen [[new hcl node install & configuration log]] [[new heterogeneous.ucd.ie install log]] == Access and Security == All access and security for the cluster is handled by the gateway machine (heterogeneous.ucd.ie). This machine is not considered a compute node and should not be used as such. The only new incoming connections allowed are ssh; other incoming packets, such as http responses to requests originating from inside the cluster (established or related), are also allowed. Incoming ssh packets are only accepted if they originate from designated IP addresses. These IPs must be registered UCD IPs. csserver.ucd.ie is allowed, as is hclgate.ucd.ie, on which all users have accounts. Thus, to gain access to the cluster, you can ssh from csserver, hclgate or another allowed machine to heterogeneous. From there you can ssh to any of the nodes (hcl01-hcl16).
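The two-hop login described above (an allowed UCD machine first, then the gateway) can be made transparent with an SSH ProxyCommand. A sketch of a possible <code>~/.ssh/config</code> fragment; the usernames are placeholders, and it assumes <code>nc</code> is installed on csserver:

```text
# Hop 1: csserver.ucd.ie accepts connections from outside UCD.
Host csserver
    HostName csserver.ucd.ie
    User your_ucd_username

# Hop 2: reach the cluster gateway by tunnelling through csserver.
Host heterogeneous
    HostName heterogeneous.ucd.ie
    User your_cluster_username
    ProxyCommand ssh csserver nc %h %p
```

With this in place, <code>ssh heterogeneous</code> from an outside machine goes through csserver in one step; from heterogeneous you still ssh to the nodes (hcl01-hcl16) as before.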
Access from outside the UCD network is only allowed once you have gained entry to a server that allows outside connections (such as csserver.ucd.ie). == Some networking issues on HCL cluster (unsolved) == "/sbin/route" should give:
 Kernel IP routing table
 Destination     Gateway         Genmask         Flags Metric Ref Use Iface
 239.2.11.72     *               255.255.255.255 UH    0      0   0   eth0
 heterogeneous.u *               255.255.255.255 UH    0      0   0   eth0
 192.168.21.0    *               255.255.255.0   U     0      0   0   eth1
 192.168.20.0    *               255.255.255.0   U     0      0   0   eth0
 192.168.20.0    *               255.255.254.0   U     0      0   0   eth0
 192.168.20.0    *               255.255.254.0   U     0      0   0   eth1
 default         heterogeneous.u 0.0.0.0         UG    0      0   0   eth0
For reasons that are not yet clear, many machines sometimes lack the entry:
 192.168.21.0    *               255.255.255.0   U     0      0   0   eth1
For Open MPI, this means that a system sockets "connect" call to any 192.168.21.* address hangs. In this case, you can either: * exclude eth1: mpirun --mca btl_tcp_if_exclude lo,eth1 ... or * restore the table above on all nodes by running "sh /etc/network/if-up.d/00routes" as root. It is not yet clear why connections to the "21" addresses fail without this entry. We would expect the following rule to match in that case (because of the mask):
 192.168.20.0    *               255.255.254.0   U     0      0   0   eth0
The packets would then leave via the eth0 network interface and travel over switch1 to switch2 and on to the eth1 interface of the corresponding node. * If one attempts a ping from one node A, via its eth0 interface, to the address of another node B's eth1 interface, the following is observed: ** outgoing ping packets appear only on the eth0 interface of node A. ** incoming ping packets appear only on the eth1 interface of node B. ** outgoing ping response packets appear on the eth0 interface of node B, never on the eth1 interface, despite the eth1 address being pinged specifically. What explains this?
With the routing tables as they are above, or in the damaged case, the ping may arrive at the correct interface, but the response from B is routed to A-eth0 via B-eth0. Further, after a number of ping packets have been sent in sequence (50 to 100), pings from A, even though the -I eth0 switch is specified, begin to appear on both A-eth0 and A-eth1. This behaviour is unexpected, but does not affect the return path of the ping response packet. In order to get a symmetric behaviour, where a packet leaves A-eth0, travels via the switch bridge to B-eth1 and returns from B-eth1 to A-eth0, one must ensure the routing table of B contains no eth0 entries. ac2573fac6c347fafee1659b51cd6c3a0d85ecb2 Other UCD Resources 0 41 389 207 2010-05-19T11:00:14Z Rhiggins 4 wikitext text/x-wiki CSI has a number of systems available for use; you will need login credentials to view this page: http://www.csi.ucd.ie/content/accounts-and-systems AIX/PowerPC, Solaris/x86 and Linux/x86 systems are available and can be handy for testing the reliability of your code across many platforms. UCD has a pay-for-use cluster named Phaeton; information on it may be viewed here: https://login.ucd.ie/ 01e2f4f326735c13e1cfe7a80736373727db0547 C/C++ 0 14 390 231 2010-05-20T09:45:17Z Root 1 /* Coding */ wikitext text/x-wiki == Coding == * C++ programming style is preferable. For example, in variable declarations, pointers and references should have their reference symbol next to the type rather than to the name. Variables should be initialized where they are declared, and should be declared where they are used.
* [http://en.wikipedia.org/wiki/Indent_style#Variant:_1TBS One-true-brace indent style] * [http://en.wikipedia.org/wiki/Pragma_once Coding header files] * Learn from examples and use coding approaches from third-party software == Commenting == * Place [[Doxygen]] comments in header files (before declarations of namespaces/classes/structs/typedefs/macros) and main source files (for documenting tools and tests) * Use double forward slashes for short comments in the code == C++ == * [http://developers.sun.com/solaris/articles/mixing.html Mixing C/C++] * Provide the main API in C * Use plain C unless you need flexible data structures or [[STL]]/[[Boost]] functionality * [http://en.wikipedia.org/wiki/Template_metaprogramming Template C++] is preferable from the point of view of runtime performance * Mind the life cycle of objects: [http://en.wikipedia.org/wiki/Default_constructor Default constructor], [http://en.wikipedia.org/wiki/Copy_constructor Copy constructor], [http://en.wikipedia.org/wiki/Destructor_(computer_science) Destructor] * [http://www.gnu.org/software/hello/manual/automake/Libtool-Convenience-Libraries.html Force C++ linking] == General == * Don't use non-standard functions, like [http://en.wikipedia.org/wiki/Itoa itoa] * [http://www.gnu.org/software/libc/manual/html_node/Program-Arguments.html Handling program arguments] (avoid <code>argp</code> since it is not supported on many platforms) * [http://en.wikipedia.org/wiki/Dynamic_loading Dynamic loading of shared libraries] 507b4bf07b3d1e24420f9568bd2334d8b2a9dbcf 391 390 2010-05-20T09:51:48Z Root 1 /* Coding */ wikitext text/x-wiki == Coding == * C++ programming style is preferable. For example, in variable declarations, pointers and references should have their reference symbol next to the type rather than to the name. Variables should be initialized where they are declared, and should be declared where they are used.
For more details, see the [http://google-styleguide.googlecode.com/svn/trunk/cppguide.xml Google C++ Style Guide] * [http://en.wikipedia.org/wiki/Indent_style#Variant:_1TBS One-true-brace indent style] * [http://en.wikipedia.org/wiki/Pragma_once Coding header files] * Learn from examples and use coding approaches from third-party software == Commenting == * Place [[Doxygen]] comments in header files (before declarations of namespaces/classes/structs/typedefs/macros) and main source files (for documenting tools and tests) * Use double forward slashes for short comments in the code == C++ == * [http://developers.sun.com/solaris/articles/mixing.html Mixing C/C++] * Provide the main API in C * Use plain C unless you need flexible data structures or [[STL]]/[[Boost]] functionality * [http://en.wikipedia.org/wiki/Template_metaprogramming Template C++] is preferable from the point of view of runtime performance * Mind the life cycle of objects: [http://en.wikipedia.org/wiki/Default_constructor Default constructor], [http://en.wikipedia.org/wiki/Copy_constructor Copy constructor], [http://en.wikipedia.org/wiki/Destructor_(computer_science) Destructor] * [http://www.gnu.org/software/hello/manual/automake/Libtool-Convenience-Libraries.html Force C++ linking] == General == * Don't use non-standard functions, like [http://en.wikipedia.org/wiki/Itoa itoa] * [http://www.gnu.org/software/libc/manual/html_node/Program-Arguments.html Handling program arguments] (avoid <code>argp</code> since it is not supported on many platforms) * [http://en.wikipedia.org/wiki/Dynamic_loading Dynamic loading of shared libraries] 410ea26936d23b2321c11574a228cdb072a08957 BLAS LAPACK ScaLAPACK 0 15 392 33 2010-06-08T15:26:38Z Root 1 wikitext text/x-wiki A de facto standard API for linear algebra: [http://en.wikipedia.org/wiki/Basic_Linear_Algebra_Subprograms BLAS]/[http://en.wikipedia.org/wiki/LAPACK LAPACK] * Original http://www.netlib.org/blas/ http://www.netlib.org/lapack/ - implemented in Fortran.
The libraries can be used in C/C++ (so called Fortran interface to BLAS/LAPACK). * ATLAS http://math-atlas.sourceforge.net/ - provides a C interface to BLAS and partially LAPACK. Binary packages: libatlas-[base or platform name, for example sse2] * MKL http://software.intel.com/en-us/intel-mkl/ - Intel implementation Using the C interface is preferable. [http://www.inf.bv.tum.de/~heisserer/softwarelab04/doc/blas_report.pdf BLAS: overview, installation, usage] f713e9c868a10a73b0cc30b9abb57dfab67a3ef3 HCL Cluster Specifications 0 50 393 318 2010-06-14T15:30:44Z Davepc 2 wikitext text/x-wiki {| border="1" cellspacing="1" cellpadding="5" | Name | Make/Model | IP | Processor | Front Side Bus | L2 Cache | RAM | HDD 1 | HDD 2 | NIC | Rack |- | ALIGN=LEFT | hclswitch1 | ALIGN=LEFT | Cisco Catalyst 3560G | ALIGN=LEFT | 192.168.21.252 | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | 24 x Gigabit | 42 |- | ALIGN=LEFT | hclswitch2 | ALIGN=LEFT | Cisco Catalyst 3560G | ALIGN=LEFT | 192.168.21.253 | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | 24 x Gigabit | 41 |- | ALIGN=LEFT | N/A | ALIGN=LEFT | APC Smart UPS 1500 | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | 1 – 2 |- | ALIGN=LEFT | Hcl01 (NIC1) | ALIGN=LEFT | Dell Poweredge 750 | ALIGN=LEFT | 192.168.21.3 | ALIGN=LEFT | 3.4 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | 250GB SATA | ALIGN=LEFT | 2 x Gigabit | 3 |- | ALIGN=LEFT | Hcl01 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.103 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 3 |- | ALIGN=LEFT | Hcl02 (NIC1) | ALIGN=LEFT | Dell Poweredge 750 | 
ALIGN=LEFT | 192.168.21.4 | ALIGN=LEFT | 3.4 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | 250GB SATA | ALIGN=LEFT | 2 x Gigabit | 4 |- | ALIGN=LEFT | Hcl02 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.104 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 4 |- | ALIGN=LEFT | Hcl03 (NIC1) | ALIGN=LEFT | Dell Poweredge 750 | ALIGN=LEFT | 192.168.21.5 | ALIGN=LEFT | 3.4 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit | 5 |- | ALIGN=LEFT | Hcl03 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.105 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 5 |- | ALIGN=LEFT | Hcl04 (NIC1) | ALIGN=LEFT | Dell Poweredge 750 | ALIGN=LEFT | 192.168.21.6 | ALIGN=LEFT | 3.4 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit | 6 |- | ALIGN=LEFT | Hcl04 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.106 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 6 |- | ALIGN=LEFT | Hcl05 (NIC1) | ALIGN=LEFT | Dell Poweredge SC1425 | ALIGN=LEFT | 192.168.21.7 | ALIGN=LEFT | 3.6 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 2MB | ALIGN=LEFT | 256MB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit | 7 |- | ALIGN=LEFT | Hcl05 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.107 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 7 |- | ALIGN=LEFT | Hcl06 (NIC1) | ALIGN=LEFT | Dell Poweredge SC1425 | ALIGN=LEFT | 192.168.21.8 | ALIGN=LEFT | 3.6 Xeon | ALIGN=LEFT | 800MHz | 
ALIGN=LEFT | 2MB | ALIGN=LEFT | 256MB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit | 8 |- | ALIGN=LEFT | Hcl06 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.108 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 8 |- | ALIGN=LEFT | Hcl07 (NIC1) | ALIGN=LEFT | Dell Poweredge 750 | ALIGN=LEFT | 192.168.21.9 | ALIGN=LEFT | 3.4 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 256MB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit | 9 |- | ALIGN=LEFT | Hcl07 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.109 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 9 |- | ALIGN=LEFT | Hcl08 (NIC1) | ALIGN=LEFT | Dell Poweredge 750 | ALIGN=LEFT | 192.168.21.10 | ALIGN=LEFT | 3.4 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 256MB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit | 10 |- | ALIGN=LEFT | Hcl08 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.110 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 10 |- | ALIGN=LEFT | Hcl09 (NIC1) | ALIGN=LEFT | IBM E-server 326 | ALIGN=LEFT | 192.168.21.11 | ALIGN=LEFT | 1.8 AMD Opteron | ALIGN=LEFT | 1GHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit | 11 |- | ALIGN=LEFT | Hcl09 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.111 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 11 |- | ALIGN=LEFT | Hcl10 (NIC1) | ALIGN=LEFT | IBM E-server 326 | ALIGN=LEFT | 192.168.21.12 | ALIGN=LEFT | 1.8 AMD Opteron | ALIGN=LEFT | 1GHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | 
ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit | 12 |- | ALIGN=LEFT | Hcl10 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.112 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 12 |- | ALIGN=LEFT | Hcl11 (NIC1) | ALIGN=LEFT | IBM X-Series 306 | ALIGN=LEFT | 192.168.21.13 | ALIGN=LEFT | 3.2 P4 | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 512MB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit | 13 |- | ALIGN=LEFT | Hcl11 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.113 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 13 |- | ALIGN=LEFT | Hcl12 (NIC1) | ALIGN=LEFT | HP Proliant DL 320 G3 | ALIGN=LEFT | 192.168.21.14 | ALIGN=LEFT | 3.4 P4 | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 512MB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit | 14 |- | ALIGN=LEFT | Hcl12 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.114 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 14 |- | ALIGN=LEFT | Hcl13 (NIC1) | ALIGN=LEFT | HP Proliant DL 320 G3 | ALIGN=LEFT | 192.168.21.15 | ALIGN=LEFT | 2.9 Celeron | ALIGN=LEFT | 533MHz | ALIGN=LEFT | 256KB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit | 15 |- | ALIGN=LEFT | Hcl13 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.115 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 15 |- | ALIGN=LEFT | Hcl14 (NIC1) | ALIGN=LEFT | HP Proliant DL 140 G2 | ALIGN=LEFT | 192.168.21.16 | ALIGN=LEFT | 3.4 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit | 16 |- | ALIGN=LEFT | 
Hcl14 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.116 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 16 |- | ALIGN=LEFT | Hcl15 (NIC1) | ALIGN=LEFT | HP Proliant DL 140 G2 | ALIGN=LEFT | 192.168.21.17 | ALIGN=LEFT | 2.8 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit | 17 |- | ALIGN=LEFT | Hcl15 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.117 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 17 |- | ALIGN=LEFT | Hcl16 (NIC1) | ALIGN=LEFT | HP Proliant DL 140 G2 | ALIGN=LEFT | 192.168.21.18 | ALIGN=LEFT | 3.6 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 2MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit | 18 |- | ALIGN=LEFT | Hcl16 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.118 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 18 |} 139e877be260b2bbc7ea006f8dac2ede6c04142b 394 393 2010-06-15T16:08:26Z Rhiggins 4 wikitext text/x-wiki {| border="1" cellspacing="1" cellpadding="5" | Name | Make/Model | IP | Processor | Front Side Bus | L2 Cache | RAM | HDD 1 | HDD 2 | NIC | Rack |- | ALIGN=LEFT | hclswitch1 | ALIGN=LEFT | Cisco Catalyst 3560G | ALIGN=LEFT | 192.168.21.252 | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | 24 x Gigabit | 42 |- | ALIGN=LEFT | hclswitch2 | ALIGN=LEFT | Cisco Catalyst 3560G | ALIGN=LEFT | 192.168.21.253 | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | 24 x Gigabit | 41 |- | ALIGN=LEFT | N/A | ALIGN=LEFT | APC Smart UPS 1500 | ALIGN=LEFT | N/A | ALIGN=LEFT | 
N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | 1 – 2 |- | ALIGN=LEFT | Hcl01 (NIC1) | ALIGN=LEFT | Dell Poweredge 750 | ALIGN=LEFT | 192.168.21.3 | ALIGN=LEFT | 3.4 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | 250GB SATA | ALIGN=LEFT | 2 x Gigabit | 3 |- | ALIGN=LEFT | Hcl01 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.103 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 3 |- | ALIGN=LEFT | Hcl02 (NIC1) | ALIGN=LEFT | Dell Poweredge 750 | ALIGN=LEFT | 192.168.21.4 | ALIGN=LEFT | 3.4 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | 250GB SATA | ALIGN=LEFT | 2 x Gigabit | 4 |- | ALIGN=LEFT | Hcl02 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.104 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 4 |- | ALIGN=LEFT | Hcl03 (NIC1) | ALIGN=LEFT | Dell Poweredge 750 | ALIGN=LEFT | 192.168.21.5 | ALIGN=LEFT | 3.4 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit | 5 |- | ALIGN=LEFT | Hcl03 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.105 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 5 |- | ALIGN=LEFT | Hcl04 (NIC1) | ALIGN=LEFT | Dell Poweredge 750 | ALIGN=LEFT | 192.168.21.6 | ALIGN=LEFT | 3.4 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit | 6 |- | ALIGN=LEFT | Hcl04 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.106 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | 
<BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 6 |- | ALIGN=LEFT | Hcl05 (NIC1) | ALIGN=LEFT | Dell Poweredge SC1425 | ALIGN=LEFT | 192.168.21.7 | ALIGN=LEFT | 3.6 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 2MB | ALIGN=LEFT | 256MB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit | 7 |- | ALIGN=LEFT | Hcl05 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.107 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 7 |- | ALIGN=LEFT | Hcl06 (NIC1) | ALIGN=LEFT | Dell Poweredge SC1425 | ALIGN=LEFT | 192.168.21.8 | ALIGN=LEFT | 3.0 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 2MB | ALIGN=LEFT | 256MB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit | 8 |- | ALIGN=LEFT | Hcl06 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.108 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 8 |- | ALIGN=LEFT | Hcl07 (NIC1) | ALIGN=LEFT | Dell Poweredge 750 | ALIGN=LEFT | 192.168.21.9 | ALIGN=LEFT | 3.4 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 256MB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit | 9 |- | ALIGN=LEFT | Hcl07 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.109 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 9 |- | ALIGN=LEFT | Hcl08 (NIC1) | ALIGN=LEFT | Dell Poweredge 750 | ALIGN=LEFT | 192.168.21.10 | ALIGN=LEFT | 3.4 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 256MB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit | 10 |- | ALIGN=LEFT | Hcl08 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.110 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 10 |- | ALIGN=LEFT | Hcl09 
(NIC1) | ALIGN=LEFT | IBM E-server 326 | ALIGN=LEFT | 192.168.21.11 | ALIGN=LEFT | 1.8 AMD Opteron | ALIGN=LEFT | 1GHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit | 11 |- | ALIGN=LEFT | Hcl09 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.111 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 11 |- | ALIGN=LEFT | Hcl10 (NIC1) | ALIGN=LEFT | IBM E-server 326 | ALIGN=LEFT | 192.168.21.12 | ALIGN=LEFT | 1.8 AMD Opteron | ALIGN=LEFT | 1GHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit | 12 |- | ALIGN=LEFT | Hcl10 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.112 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 12 |- | ALIGN=LEFT | Hcl11 (NIC1) | ALIGN=LEFT | IBM X-Series 306 | ALIGN=LEFT | 192.168.21.13 | ALIGN=LEFT | 3.2 P4 | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 512MB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit | 13 |- | ALIGN=LEFT | Hcl11 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.113 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 13 |- | ALIGN=LEFT | Hcl12 (NIC1) | ALIGN=LEFT | HP Proliant DL 320 G3 | ALIGN=LEFT | 192.168.21.14 | ALIGN=LEFT | 3.4 P4 | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 512MB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit | 14 |- | ALIGN=LEFT | Hcl12 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.114 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 14 |- | ALIGN=LEFT | Hcl13 (NIC1) | ALIGN=LEFT | HP Proliant DL 320 G3 | ALIGN=LEFT | 192.168.21.15 
| ALIGN=LEFT | 2.9 Celeron | ALIGN=LEFT | 533MHz | ALIGN=LEFT | 256KB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit | 15 |- | ALIGN=LEFT | Hcl13 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.115 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 15 |- | ALIGN=LEFT | Hcl14 (NIC1) | ALIGN=LEFT | HP Proliant DL 140 G2 | ALIGN=LEFT | 192.168.21.16 | ALIGN=LEFT | 3.4 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit | 16 |- | ALIGN=LEFT | Hcl14 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.116 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 16 |- | ALIGN=LEFT | Hcl15 (NIC1) | ALIGN=LEFT | HP Proliant DL 140 G2 | ALIGN=LEFT | 192.168.21.17 | ALIGN=LEFT | 2.8 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit | 17 |- | ALIGN=LEFT | Hcl15 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.117 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 17 |- | ALIGN=LEFT | Hcl16 (NIC1) | ALIGN=LEFT | HP Proliant DL 140 G2 | ALIGN=LEFT | 192.168.21.18 | ALIGN=LEFT | 3.6 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 2MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit | 18 |- | ALIGN=LEFT | Hcl16 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.118 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 18 |} 4dbab47523572eff7ee7d0e763a65f0d492e7b3a Main Page 0 1 395 242 2010-06-17T08:49:28Z Root 1 /* Libraries */ wikitext text/x-wiki This site is set up for 
sharing ideas, findings and experience in heterogeneous computing. Please, log in and create new or edit existing pages. How to format wiki-pages read [[Help:Editing|here]]. == Operating systems == * [[Linux]] * [[Windows]] == Development tools == * [[C/C++]], [[Python]] * [[Autotools]] * [[GDB]], [[OProfile]], [[Valgrind]] * [[Doxygen]] * [[ChangeLog]], [[Subversion]] * [[Eclipse]] == [[Libraries]] == * [[GNU C Library]] * [[MPI]] * [[STL]], [[Boost]] * [[GSL]] * [[BLAS/LAPACK]] * [[The MPI LogP Benchmark|logp_mpi]] == Data processing == * [[gnuplot]] * [[Graphviz]] * [[Octave]], [[R]] == Presentation == * [[Dia]] * [[Latex]] == Clusters == * [[HCL cluster]] * [[Other UCD Resources]] * [[UTK clusters]] * [[Grid5000]] [[SSH|How to connect to cluster via SSH]] == Some Maths == * [http://en.wikipedia.org/wiki/Confidence_interval Confidence interval (Statistics)]. [http://en.wikipedia.org/wiki/Student's_t-distribution Student's t-distribution], which is used to find confidence interval, is implemented in [[GSL]] (<code>gsl_cdf_tdist_Pinv</code>) * [http://en.wikipedia.org/wiki/Binomial_tree#Binomial_tree Binomial tree]. Use [[Graphviz]] to visualize trees 993353d54edab5f4df3096b62cb8de03c297a3e3 397 395 2010-06-17T09:08:54Z Root 1 wikitext text/x-wiki This site is set up for sharing ideas, findings and experience in heterogeneous computing. Please, log in and create new or edit existing pages. How to format wiki-pages read [[Help:Editing|here]]. 
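The Maths bullet above points at GSL's <code>gsl_cdf_tdist_Pinv</code> for the Student's t quantiles used in confidence intervals. As a rough command-line counterpart (an illustration, not from the wiki: it uses the normal-approximation quantile 1.96 instead of the exact t quantile), a 95% confidence interval for a column of measurements can be computed with awk:

```bash
#!/bin/sh
# Sketch: 95% confidence interval for a list of numbers (one per line).
# Uses the normal approximation z = 1.96; for small samples the exact
# Student's t quantile (e.g. GSL's gsl_cdf_tdist_Pinv) should be used instead.
printf '%s\n' 10 12 11 13 9 11 | awk '
  { sum += $1; sumsq += $1 * $1; n++ }
  END {
    mean = sum / n
    var  = (sumsq - n * mean * mean) / (n - 1)   # sample variance
    half = 1.96 * sqrt(var / n)                  # half-width of the interval
    printf "mean=%.2f ci=[%.2f, %.2f]\n", mean, mean - half, mean + half
  }'
```

For the six sample values this prints <code>mean=11.00 ci=[9.87, 12.13]</code>; the exact t quantile (2.571 for 5 degrees of freedom) would give a somewhat wider interval.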
== Operating systems == * [[Linux]] * [[Windows]] == Development tools == * [[C/C++]], [[Python]] * [[Autotools]] * [[GDB]], [[OProfile]], [[Valgrind]] * [[Doxygen]] * [[ChangeLog]], [[Subversion]] * [[Eclipse]] == [[Libraries]] == * [[GNU C Library]] * [[MPI]] * [[STL]], [[Boost]] * [[GSL]] * [[BLAS/LAPACK]] * [[The MPI LogP Benchmark|logp_mpi]] == Data processing == * [[gnuplot]] * [[Graphviz]] * [[Octave]], [[R]] == Presentation == * [[Dia]] * [[Latex]] == Clusters == * [[HCL cluster]] * [[Other UCD Resources]] * [[UTK clusters]] * [[Grid5000]] [[SSH|How to connect to cluster via SSH]] == Mathematics == * [http://en.wikipedia.org/wiki/Confidence_interval Confidence interval (Statistics)]. [http://en.wikipedia.org/wiki/Student's_t-distribution Student's t-distribution], which is used to find confidence interval, is implemented in [[GSL]] (<code>gsl_cdf_tdist_Pinv</code>) * [http://en.wikipedia.org/wiki/Binomial_tree#Binomial_tree Binomial tree]. Use [[Graphviz]] to visualize trees * [http://en.wikipedia.org/wiki/Spline_interpolation Spline interpolation], [http://en.wikipedia.org/wiki/B-spline Spline approximation] == Tips & Tricks == * [http://www.gnu.org/s/libc/manual/html_node/Date-and-Time.html#Date-and-Time Timing in C] 18232ba30b8210be4928b4f1fc738b5dd9990dae 398 397 2010-06-17T09:10:46Z Root 1 /* Mathematics */ wikitext text/x-wiki This site is set up for sharing ideas, findings and experience in heterogeneous computing. Please, log in and create new or edit existing pages. How to format wiki-pages read [[Help:Editing|here]]. 
== Operating systems == * [[Linux]] * [[Windows]] == Development tools == * [[C/C++]], [[Python]] * [[Autotools]] * [[GDB]], [[OProfile]], [[Valgrind]] * [[Doxygen]] * [[ChangeLog]], [[Subversion]] * [[Eclipse]] == [[Libraries]] == * [[GNU C Library]] * [[MPI]] * [[STL]], [[Boost]] * [[GSL]] * [[BLAS/LAPACK]] * [[The MPI LogP Benchmark|logp_mpi]] == Data processing == * [[gnuplot]] * [[Graphviz]] * [[Octave]], [[R]] == Presentation == * [[Dia]] * [[Latex]] == Clusters == * [[HCL cluster]] * [[Other UCD Resources]] * [[UTK clusters]] * [[Grid5000]] [[SSH|How to connect to cluster via SSH]] == Mathematics == * [http://en.wikipedia.org/wiki/Confidence_interval Confidence interval (Statistics)], [http://en.wikipedia.org/wiki/Student's_t-distribution Student's t-distribution] (implemented in [[GSL]], <code>gsl_cdf_tdist_Pinv</code>) * [http://en.wikipedia.org/wiki/Binomial_tree#Binomial_tree Binomial tree] (use [[Graphviz]] to visualize trees) * [http://en.wikipedia.org/wiki/Spline_interpolation Spline interpolation], [http://en.wikipedia.org/wiki/B-spline Spline approximation] (implemented in [[GSL]]) == Tips & Tricks == * [http://www.gnu.org/s/libc/manual/html_node/Date-and-Time.html#Date-and-Time Timing in C] c49ca0c752c8966c414fa8d638c1ea8e8fa52435 399 398 2010-06-17T09:10:59Z Root 1 /* Mathematics */ wikitext text/x-wiki This site is set up for sharing ideas, findings and experience in heterogeneous computing. Please, log in and create new or edit existing pages. How to format wiki-pages read [[Help:Editing|here]]. 
== Operating systems == * [[Linux]] * [[Windows]] == Development tools == * [[C/C++]], [[Python]] * [[Autotools]] * [[GDB]], [[OProfile]], [[Valgrind]] * [[Doxygen]] * [[ChangeLog]], [[Subversion]] * [[Eclipse]] == [[Libraries]] == * [[GNU C Library]] * [[MPI]] * [[STL]], [[Boost]] * [[GSL]] * [[BLAS/LAPACK]] * [[The MPI LogP Benchmark|logp_mpi]] == Data processing == * [[gnuplot]] * [[Graphviz]] * [[Octave]], [[R]] == Presentation == * [[Dia]] * [[Latex]] == Clusters == * [[HCL cluster]] * [[Other UCD Resources]] * [[UTK clusters]] * [[Grid5000]] [[SSH|How to connect to cluster via SSH]] == Mathematics == * [http://en.wikipedia.org/wiki/Confidence_interval Confidence interval (Statistics)], [http://en.wikipedia.org/wiki/Student's_t-distribution Student's t-distribution] (implemented in [[GSL]]) * [http://en.wikipedia.org/wiki/Binomial_tree#Binomial_tree Binomial tree] (use [[Graphviz]] to visualize trees) * [http://en.wikipedia.org/wiki/Spline_interpolation Spline interpolation], [http://en.wikipedia.org/wiki/B-spline Spline approximation] (implemented in [[GSL]]) == Tips & Tricks == * [http://www.gnu.org/s/libc/manual/html_node/Date-and-Time.html#Date-and-Time Timing in C] 72ca4ec18505203225b2834673206c3159f03742 404 399 2010-06-17T09:15:46Z Root 1 /* Presentation */ wikitext text/x-wiki This site is set up for sharing ideas, findings and experience in heterogeneous computing. Please log in to create new pages or edit existing ones. To learn how to format wiki pages, read [[Help:Editing|here]].
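The ''Timing in C'' link under Tips & Tricks above covers glibc's date-and-time functions; for quickly timing a whole command from the shell, a small sketch using GNU <code>date</code> (the <code>%N</code> nanosecond field is a GNU extension, so this assumes Linux):

```bash
#!/bin/sh
# Sketch: wall-clock timing of a command from the shell.
# %N (nanoseconds) is a GNU date extension; elsewhere, use `time` instead.
start=$(date +%s%N)
sleep 0.1                          # the command being measured
end=$(date +%s%N)
elapsed_ms=$(( (end - start) / 1000000 ))
echo "elapsed: ${elapsed_ms} ms"
```

For timing code inside a C program, the glibc functions from the link (e.g. <code>gettimeofday</code>) give the same wall-clock measurement without spawning processes.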
== Operating systems == * [[Linux]] * [[Windows]] == Development tools == * [[C/C++]], [[Python]] * [[Autotools]] * [[GDB]], [[OProfile]], [[Valgrind]] * [[Doxygen]] * [[ChangeLog]], [[Subversion]] * [[Eclipse]] == [[Libraries]] == * [[GNU C Library]] * [[MPI]] * [[STL]], [[Boost]] * [[GSL]] * [[BLAS/LAPACK]] * [[The MPI LogP Benchmark|logp_mpi]] == Data processing == * [[gnuplot]] * [[Graphviz]] * [[Octave]], [[R]] == Presentation == * [[Dia]] * [[LaTeX]] == Clusters == * [[HCL cluster]] * [[Other UCD Resources]] * [[UTK clusters]] * [[Grid5000]] [[SSH|How to connect to cluster via SSH]] == Mathematics == * [http://en.wikipedia.org/wiki/Confidence_interval Confidence interval (Statistics)], [http://en.wikipedia.org/wiki/Student's_t-distribution Student's t-distribution] (implemented in [[GSL]]) * [http://en.wikipedia.org/wiki/Binomial_tree#Binomial_tree Binomial tree] (use [[Graphviz]] to visualize trees) * [http://en.wikipedia.org/wiki/Spline_interpolation Spline interpolation], [http://en.wikipedia.org/wiki/B-spline Spline approximation] (implemented in [[GSL]]) == Tips & Tricks == * [http://www.gnu.org/s/libc/manual/html_node/Date-and-Time.html#Date-and-Time Timing in C] 89e1214b11b7954d182d16f1e0004ef2ea0e814c GNU C Library 0 62 396 2010-06-17T08:49:44Z Root 1 New page: http://en.wikipedia.org/wiki/GNU_C_Library http://www.gnu.org/s/libc/manual/html_node/index.html wikitext text/x-wiki http://en.wikipedia.org/wiki/GNU_C_Library http://www.gnu.org/s/libc/manual/html_node/index.html 0b2fc61dcfad0f98ec3a4b0a6683730662a429ab LaTeX 0 20 400 199 2010-06-17T09:13:44Z Root 1 /* Editors */ wikitext text/x-wiki * Beamer - a package for presentation slides * Listings - a package for putting programming code within LaTeX - http://tug.ctan.org/cgi-bin/ctanPackageInformation.py?id=listings * Latex can be used in the [[Doxygen]] documentation in order to include formulas, biblio references etc == Editors == * Kile * Emacs + plugin * [[Eclipse]] + [[TeXlipse]] *
[[http://www.texniccenter.org/ TeXnicCenter]] (for Windows) acd3792e8aa5479b4549e093aa61e18fbf19b5b5 401 400 2010-06-17T09:15:20Z Root 1 wikitext text/x-wiki * Beamer - a package for presentation slides * Listings - a package for putting programming code within LaTeX - http://tug.ctan.org/cgi-bin/ctanPackageInformation.py?id=listings * Latex can be used in the [[Doxygen]] documentation in order to include formulas, biblio references etc == Editors == * Kile * Emacs + plugin * [[Eclipse]] + [[TeXlipse]] == Windows == * [[http://miktex.org/ MiKTeX]] - LaTeX implementation * [[http://www.texniccenter.org/ TeXnicCenter]] - editor 05fef3a7fea26c14da5d0af5d29242582995fcdf 402 401 2010-06-17T09:15:35Z Root 1 [[Latex]] moved to [[LaTeX]] wikitext text/x-wiki * Beamer - a package for presentation slides * Listings - a package for putting programming code within LaTeX - http://tug.ctan.org/cgi-bin/ctanPackageInformation.py?id=listings * Latex can be used in the [[Doxygen]] documentation in order to include formulas, biblio references etc == Editors == * Kile * Emacs + plugin * [[Eclipse]] + [[TeXlipse]] == Windows == * [[http://miktex.org/ MiKTeX]] - LaTeX implementation * [[http://www.texniccenter.org/ TeXnicCenter]] - editor 05fef3a7fea26c14da5d0af5d29242582995fcdf 405 402 2010-06-17T09:16:11Z Root 1 wikitext text/x-wiki * Beamer - a package for presentation slides * Listings - a package for putting programming code within LaTeX - http://tug.ctan.org/cgi-bin/ctanPackageInformation.py?id=listings * Latex can be used in the [[Doxygen]] documentation in order to include formulas, biblio references etc == Editors == * Kile * Emacs + plugin * [[Eclipse]] + [[TeXlipse]] == Windows == * [http://miktex.org/ MiKTeX] - LaTeX implementation * [http://www.texniccenter.org/ TeXnicCenter] - editor 5499d45ff9da10eca94d638a16b80ac52fab2eca Latex 0 63 403 2010-06-17T09:15:35Z Root 1 [[Latex]] moved to [[LaTeX]] wikitext text/x-wiki #REDIRECT [[LaTeX]]
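To go with the Beamer entry above, a minimal slide-deck skeleton can be generated and compiled from the shell (the file name <code>slides.tex</code> and the titles are illustrative, not from the wiki):

```bash
#!/bin/sh
# Sketch: write a minimal Beamer presentation skeleton to slides.tex.
# The document and frame titles are placeholder values.
cat > slides.tex <<'EOF'
\documentclass{beamer}
\title{Example talk}
\begin{document}
\frame{\titlepage}
\begin{frame}{First slide}
A minimal Beamer frame.
\end{frame}
\end{document}
EOF
# Compile (requires a LaTeX installation such as TeX Live or MiKTeX):
# pdflatex slides.tex
echo "wrote slides.tex"
```

The same skeleton compiles under any of the editors listed above, since they all drive a standard LaTeX toolchain.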
e8da4acbf53aab9269cca95417457a2fb359055b Gnuplot 0 22 406 45 2010-06-17T14:37:13Z Root 1 wikitext text/x-wiki http://www.gnuplot.info/documentation.html http://t16web.lanl.gov/Kawano/gnuplot/index-e.html [http://gnuplot.sourceforge.net/demo/ examples] 5e2a07544faafca80cbb014f79a0ccb32d8c171d 407 406 2010-06-17T14:37:25Z Root 1 wikitext text/x-wiki http://www.gnuplot.info/documentation.html http://t16web.lanl.gov/Kawano/gnuplot/index-e.html [http://gnuplot.sourceforge.net/demo/ Examples] 56ab25df170fca4abb3ac6598e42551bd1e7def0 408 407 2010-06-17T14:37:49Z Root 1 wikitext text/x-wiki [http://www.gnuplot.info/documentation.html Official gnuplot documentation] http://t16web.lanl.gov/Kawano/gnuplot/index-e.html [http://gnuplot.sourceforge.net/demo/ Examples] ef3feb7cb5903b828d1151eb92cad3655fc5da20 409 408 2010-06-17T14:38:01Z Root 1 wikitext text/x-wiki [http://www.gnuplot.info/documentation.html Official gnuplot documentation] http://t16web.lanl.gov/Kawano/gnuplot/index-e.html [http://gnuplot.sourceforge.net/demo/ Demo scripts for gnuplot] a040922102c2a8b89849ca0222db1c1f1371b79a 410 409 2010-06-17T14:38:42Z Root 1 wikitext text/x-wiki [http://www.gnuplot.info/documentation.html Official gnuplot documentation] [http://gnuplot.sourceforge.net/demo/ Demo scripts for gnuplot] [http://t16web.lanl.gov/Kawano/gnuplot/index-e.html GNUPLOT: not so Frequently Asked Questions] f911acf2d8923f76bc77bf548648fa4924827b7e Linux 0 3 411 243 2010-06-21T12:33:02Z Root 1 wikitext text/x-wiki == Environment == * '''.*rc''' - for non-login shells * '''.*profile''' - for login shells, uses the rc settings == Utilities == * '''mc''' (midnight commander) - a file manager with a built-in text editor. To copy text, hold the shift button. == Tips and Tricks == * [[SSH|How to connect via SSH]] * Use <code>update-alternatives --config NAME</code> to switch between different software implementations.
For example, <code>update-alternatives --config java</code> allows you to switch between Sun, OpenJDK and GNU Java 0bdf5ccf98d264efcad9a59bd9202d43f27fa45f Eclipse 0 9 412 196 2010-06-21T12:35:45Z Root 1 wikitext text/x-wiki http://www.eclipse.org * Install Sun Java 6 (Debian package: sun-java6-jre). If you experience problems with Sun, use OpenJDK (Debian package: openjdk-6-jre). Don't use GNU Java - it's too slow. * It is recommended to use the latest version from http://www.eclipse.org/downloads/ Eclipse CDT - a distribution for C/C++ development * How to install plugins: http://agile.csc.ncsu.edu/SEMaterials/tutorials/install_plugin/ http://www.venukb.com/2006/08/20/install-eclipse-plugins-the-easy-way/ == Plugins == * [[Linuxtools]] * [[Eclox]] * [[Subversive]] or [[Subclipse]] * [[TeXlipse]] == Usage == * To avoid unresolved inclusions at edit time, add paths to Project -> Properties -> C/C++ General -> Paths and Symbols -> Includes * The comments <code>// TODO: ...</code> mark what you are going to do later. These parts of the code can easily be found if you open the Tasks view (Window -> Show View -> Tasks) f47df2fd7a870ca4c6dfbbf24eed8428e891a1e7 Grid5000 0 6 413 251 2010-06-21T16:53:07Z Root 1 wikitext text/x-wiki https://www.grid5000.fr/mediawiki/index.php/Grid5000:Home == Usage == * Select sites and clusters for experiments, using information on the [https://www.grid5000.fr/mediawiki/index.php/Grid5000:Network#Grid.275000_Sites Grid5000 network] * Access is provided via access nodes: '''access.SITE.grid5000.fr'''. As soon as you are on one of the sites, you can ssh directly to the frontend node of any other site: '''frontend.SITE2'''. * There is no access to the Internet from computing nodes (external IPs should be registered on proxy), therefore, download/update your stuff at the access nodes. Several revision control clients are available.
* Each site has a separate NFS, therefore, to run an application on several sites at once, you need to copy it with '''scp''', '''sftp''' or '''rsync''' between access or frontend nodes. * Jobs are run from the frontend nodes, using a [http://en.wikipedia.org/wiki/OpenPBS PBS]-like system [https://www.grid5000.fr/mediawiki/index.php/Cluster_experiment-OAR2 OAR]. Basic commands: ** '''oarstat''' - queue status ** '''oarsub''' - job submission ** '''oardel''' - job removal Interactive job on deployed images: <source lang="bash"> $ oarsub -I -t deploy -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> Batch job on installed images: <source lang="bash"> $ oarsub BATCH_FILE -t allow_classic_ssh -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> * The image to deploy can be created with the help of a [http://wiki.systemimager.org/index.php/Main_Page Systemimager]-like system [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2 Kadeploy]. A Linux distribution lenny-x64-nfs-2.1 with mc, subversion, autotools, doxygen, MPICH2, GSL, Boost, R, gnuplot, graphviz, X11, evince is available at Orsay /home/nancy/alastovetsky/grid5000. * Running MPI applications is described [https://www.grid5000.fr/mediawiki/index.php/Run_MPI_On_Grid%275000 here] 30dc84bd9faa88ab4144247011eaec183e59af71 414 413 2010-06-21T16:59:22Z Root 1 wikitext text/x-wiki https://www.grid5000.fr/mediawiki/index.php/Grid5000:Home == Usage == * Select sites and clusters for experiments, using information on the [https://www.grid5000.fr/mediawiki/index.php/Grid5000:Network#Grid.275000_Sites Grid5000 network] * Access is provided via access nodes: '''access.SITE.grid5000.fr'''. As soon as you are on one of the sites, you can ssh directly to the frontend node of any other site: '''frontend.SITE2'''. * There is no access to the Internet from computing nodes (external IPs should be registered on proxy), therefore, download/update your stuff at the access nodes.
Several revision control clients are available. * Each site has a separate NFS, therefore, to run an application on several sites at once, you need to copy it with '''scp''', '''sftp''' or '''rsync''' between access or frontend nodes. * Jobs are run from the frontend nodes, using a [http://en.wikipedia.org/wiki/OpenPBS PBS]-like system [https://www.grid5000.fr/mediawiki/index.php/Cluster_experiment-OAR2 OAR]. Basic commands: ** '''oarstat''' - queue status ** '''oarsub''' - job submission ** '''oardel''' - job removal Interactive job on deployed images: <source lang="bash"> $ oarsub -I -t deploy -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> Batch job on installed images: <source lang="bash"> $ oarsub BATCH_FILE -t allow_classic_ssh -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> * The image to deploy can be created with the help of a [http://wiki.systemimager.org/index.php/Main_Page Systemimager]-like system [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2 Kadeploy]. Loading an image: <source lang="bash"> $ kadeploy3 -e PATH_TO_IMAGE_DESC -f $OAR_FILE_NODES </source> A Linux distribution lenny-x64-nfs-2.1 with mc, subversion, autotools, doxygen, MPICH2, GSL, Boost, R, gnuplot, graphviz, X11, evince is available at Orsay /home/nancy/alastovetsky/grid5000. * Running MPI applications is described [https://www.grid5000.fr/mediawiki/index.php/Run_MPI_On_Grid%275000 here] 78d47c2df0cbac5f3c9a7c3474df0ac9c5b1ff08 415 414 2010-06-21T17:05:18Z Root 1 wikitext text/x-wiki https://www.grid5000.fr/mediawiki/index.php/Grid5000:Home == Usage == * Select sites and clusters for experiments, using information on the [https://www.grid5000.fr/mediawiki/index.php/Grid5000:Network#Grid.275000_Sites Grid5000 network] * Access is provided via access nodes: '''access.SITE.grid5000.fr'''.
As soon as you are on one of the sites, you can ssh directly to the frontend node of any other site: <source lang="bash"> access_$ ssh frontend.SITE2 </source> * There is no access to the Internet from computing nodes (external IPs should be registered on proxy), therefore, download/update your stuff at the access nodes. Several revision control clients are available. * Each site has a separate NFS, therefore, to run an application on several sites at once, you need to copy it with '''scp''', '''sftp''' or '''rsync''' between access or frontend nodes. * Jobs are run from the frontend nodes, using a [http://en.wikipedia.org/wiki/OpenPBS PBS]-like system [https://www.grid5000.fr/mediawiki/index.php/Cluster_experiment-OAR2 OAR]. Basic commands: ** '''oarstat''' - queue status ** '''oarsub''' - job submission ** '''oardel''' - job removal Interactive job on deployed images: <source lang="bash"> frontend_$ oarsub -I -t deploy -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> Batch job on installed images: <source lang="bash"> frontend_$ oarsub BATCH_FILE -t allow_classic_ssh -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> * The image to deploy can be created and loaded with the help of a [http://wiki.systemimager.org/index.php/Main_Page Systemimager]-like system [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2 Kadeploy]. Creating: [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2#Tune_an_environment_to_build_another_one:_customize_authentification_parameters described here] Loading: <source lang="bash"> frontend_$ kadeploy3 -e PATH_TO_IMAGE_DESC -f $OAR_FILE_NODES </source> A Linux distribution lenny-x64-nfs-2.1 with mc, subversion, autotools, doxygen, MPICH2, GSL, Boost, R, gnuplot, graphviz, X11, evince is available at Orsay /home/nancy/alastovetsky/grid5000.
* Running MPI applications is described [https://www.grid5000.fr/mediawiki/index.php/Run_MPI_On_Grid%275000 here] c0c0266c32d1a26c0bde80ea087b1586ff8d466f 416 415 2010-06-21T17:06:11Z Root 1 wikitext text/x-wiki https://www.grid5000.fr/mediawiki/index.php/Grid5000:Home == Usage == # Select sites and clusters for experiments, using information on the [https://www.grid5000.fr/mediawiki/index.php/Grid5000:Network#Grid.275000_Sites Grid5000 network] # Access is provided via access nodes: '''access.SITE.grid5000.fr'''. As soon as you are on one of the sites, you can ssh directly to the frontend node of any other site: <source lang="bash"> access_$ ssh frontend.SITE2 </source> # There is no access to the Internet from computing nodes (external IPs should be registered on proxy), therefore, download/update your stuff at the access nodes. Several revision control clients are available. # Each site has a separate NFS, therefore, to run an application on several sites at once, you need to copy it with '''scp''', '''sftp''' or '''rsync''' between access or frontend nodes. # Jobs are run from the frontend nodes, using a [http://en.wikipedia.org/wiki/OpenPBS PBS]-like system [https://www.grid5000.fr/mediawiki/index.php/Cluster_experiment-OAR2 OAR]. Basic commands: ** '''oarstat''' - queue status ** '''oarsub''' - job submission ** '''oardel''' - job removal Interactive job on deployed images: <source lang="bash"> frontend_$ oarsub -I -t deploy -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> Batch job on installed images: <source lang="bash"> frontend_$ oarsub BATCH_FILE -t allow_classic_ssh -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> # The image to deploy can be created and loaded with the help of a [http://wiki.systemimager.org/index.php/Main_Page Systemimager]-like system [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2 Kadeploy].
Creating: [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2#Tune_an_environment_to_build_another_one:_customize_authentification_parameters described here] Loading: <source lang="bash"> frontend_$ kadeploy3 -e PATH_TO_IMAGE_DESC -f $OAR_FILE_NODES </source> A Linux distribution lenny-x64-nfs-2.1 with mc, subversion, autotools, doxygen, MPICH2, GSL, Boost, R, gnuplot, graphviz, X11, evince is available at Orsay /home/nancy/alastovetsky/grid5000. # Running MPI applications is described [https://www.grid5000.fr/mediawiki/index.php/Run_MPI_On_Grid%275000 here] 3babd143453cfee2bdee125a0b15017d483d548f 417 416 2010-06-21T17:06:42Z Root 1 Undo revision 416 by [[Special:Contributions/Root|Root]] ([[User talk:Root|Talk]]) wikitext text/x-wiki https://www.grid5000.fr/mediawiki/index.php/Grid5000:Home == Usage == * Select sites and clusters for experiments, using information on the [https://www.grid5000.fr/mediawiki/index.php/Grid5000:Network#Grid.275000_Sites Grid5000 network] * Access is provided via access nodes: '''access.SITE.grid5000.fr'''. As soon as you are on one of the sites, you can ssh directly to the frontend node of any other site: <source lang="bash"> access_$ ssh frontend.SITE2 </source> * There is no access to the Internet from computing nodes (external IPs should be registered on proxy), therefore, download/update your stuff at the access nodes. Several revision control clients are available. * Each site has a separate NFS, therefore, to run an application on several sites at once, you need to copy it with '''scp''', '''sftp''' or '''rsync''' between access or frontend nodes. * Jobs are run from the frontend nodes, using a [http://en.wikipedia.org/wiki/OpenPBS PBS]-like system [https://www.grid5000.fr/mediawiki/index.php/Cluster_experiment-OAR2 OAR].
Basic commands: ** '''oarstat''' - queue status ** '''oarsub''' - job submission ** '''oardel''' - job removal Interactive job on deployed images: <source lang="bash"> frontend_$ oarsub -I -t deploy -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> Batch job on installed images: <source lang="bash"> frontend_$ oarsub BATCH_FILE -t allow_classic_ssh -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> * The image to deploy can be created and loaded with the help of a [http://wiki.systemimager.org/index.php/Main_Page Systemimager]-like system [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2 Kadeploy]. Creating: [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2#Tune_an_environment_to_build_another_one:_customize_authentification_parameters described here] Loading: <source lang="bash"> frontend_$ kadeploy3 -e PATH_TO_IMAGE_DESC -f $OAR_FILE_NODES </source> A Linux distribution lenny-x64-nfs-2.1 with mc, subversion, autotools, doxygen, MPICH2, GSL, Boost, R, gnuplot, graphviz, X11, evince is available at Orsay /home/nancy/alastovetsky/grid5000. * Running MPI applications is described [https://www.grid5000.fr/mediawiki/index.php/Run_MPI_On_Grid%275000 here] c0c0266c32d1a26c0bde80ea087b1586ff8d466f 418 417 2010-06-22T10:38:42Z Root 1 wikitext text/x-wiki https://www.grid5000.fr/mediawiki/index.php/Grid5000:Home == Usage == * Select sites and clusters for experiments, using information on the [https://www.grid5000.fr/mediawiki/index.php/Grid5000:Network#Grid.275000_Sites Grid5000 network] * Access is provided via access nodes '''access.SITE.grid5000.fr''' marked [https://www.grid5000.fr/mediawiki/index.php/External_access here] as ''accessible from everywhere''.
As soon as you are on one of the sites, you can ssh directly to the frontend node of any other site: <source lang="bash"> access_$ ssh frontend.SITE2 </source> * There is no access to the Internet from computing nodes (external IPs should be registered on proxy), therefore, download/update your stuff at the access nodes. Several revision control clients are available. * Each site has a separate NFS, therefore, to run an application on several sites at once, you need to copy it with '''scp''', '''sftp''' or '''rsync''' between access or frontend nodes. * Jobs are run from the frontend nodes, using a [http://en.wikipedia.org/wiki/OpenPBS PBS]-like system [https://www.grid5000.fr/mediawiki/index.php/Cluster_experiment-OAR2 OAR]. Basic commands: ** '''oarstat''' - queue status ** '''oarsub''' - job submission ** '''oardel''' - job removal Interactive job on deployed images: <source lang="bash"> frontend_$ oarsub -I -t deploy -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> Batch job on installed images: <source lang="bash"> frontend_$ oarsub BATCH_FILE -t allow_classic_ssh -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> * The image to deploy can be created and loaded with the help of a [http://wiki.systemimager.org/index.php/Main_Page Systemimager]-like system [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2 Kadeploy]. Creating: [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2#Tune_an_environment_to_build_another_one:_customize_authentification_parameters described here] Loading: <source lang="bash"> frontend_$ kadeploy3 -e PATH_TO_IMAGE_DESC -f $OAR_FILE_NODES </source> A Linux distribution lenny-x64-nfs-2.1 with mc, subversion, autotools, doxygen, MPICH2, GSL, Boost, R, gnuplot, graphviz, X11, evince is available at Orsay /home/nancy/alastovetsky/grid5000.
* Running MPI applications is described [https://www.grid5000.fr/mediawiki/index.php/Run_MPI_On_Grid%275000 here] a63bb6a430d63528ac9940b63d588799a75836df 419 418 2010-06-22T10:49:38Z Root 1 wikitext text/x-wiki https://www.grid5000.fr/mediawiki/index.php/Grid5000:Home == Usage == * Select sites and clusters for experiments, using information on the [https://www.grid5000.fr/mediawiki/index.php/Grid5000:Network#Grid.275000_Sites Grid5000 network] * Access is provided via access nodes '''access.SITE.grid5000.fr''' marked [https://www.grid5000.fr/mediawiki/index.php/External_access here] as ''accessible from '''everywhere''' via ssh with '''keyboard-interactive''' authentication method''. As soon as you are on one of the sites, you can ssh directly to the frontend node of any other site: <source lang="bash"> access_$ ssh frontend.SITE2 </source> * There is no access to the Internet from computing nodes (external IPs should be registered on proxy), therefore, download/update your stuff at the access nodes. Several revision control clients are available. * Each site has a separate NFS, therefore, to run an application on several sites at once, you need to copy it with '''scp''', '''sftp''' or '''rsync''' between access or frontend nodes. * Jobs are run from the frontend nodes, using a [http://en.wikipedia.org/wiki/OpenPBS PBS]-like system [https://www.grid5000.fr/mediawiki/index.php/Cluster_experiment-OAR2 OAR].
Basic commands: ** '''oarstat''' - queue status ** '''oarsub''' - job submission ** '''oardel''' - job removal Interactive job on deployed images: <source lang="bash"> frontend_$ oarsub -I -t deploy -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> Batch job on installed images: <source lang="bash"> frontend_$ oarsub BATCH_FILE -t allow_classic_ssh -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> * The image to deploy can be created and loaded with the help of a [http://wiki.systemimager.org/index.php/Main_Page Systemimager]-like system [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2 Kadeploy]. Creating: [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2#Tune_an_environment_to_build_another_one:_customize_authentification_parameters described here] Loading: <source lang="bash"> frontend_$ kadeploy3 -e PATH_TO_IMAGE_DESC -f $OAR_FILE_NODES </source> A Linux distribution lenny-x64-nfs-2.1 with mc, subversion, autotools, doxygen, MPICH2, GSL, Boost, R, gnuplot, graphviz, X11, evince is available at Orsay /home/nancy/alastovetsky/grid5000. * Running MPI applications is described [https://www.grid5000.fr/mediawiki/index.php/Run_MPI_On_Grid%275000 here] 3a0601f886d7f4dd7452d1277fbe51c183ff6f85 420 419 2010-06-22T10:52:54Z Root 1 wikitext text/x-wiki https://www.grid5000.fr/mediawiki/index.php/Grid5000:Home == Usage == * Select sites and clusters for experiments, using information on the [https://www.grid5000.fr/mediawiki/index.php/Grid5000:Network#Grid.275000_Sites Grid5000 network] and the [https://www.grid5000.fr/mediawiki/index.php/Status Status page] * Access is provided via access nodes '''access.SITE.grid5000.fr''' marked [https://www.grid5000.fr/mediawiki/index.php/External_access here] as ''accessible from '''everywhere''' via ssh with '''keyboard-interactive''' authentication method''.
As soon as you are on one of the sites, you can ssh directly to the frontend node of any other site: <source lang="bash"> access_$ ssh frontend.SITE2 </source> * There is no access to the Internet from computing nodes (external IPs should be registered on proxy), therefore, download/update your stuff at the access nodes. Several revision control clients are available. * Each site has a separate NFS, therefore, to run an application on several sites at once, you need to copy it with '''scp''', '''sftp''' or '''rsync''' between access or frontend nodes. * Jobs are run from the frontend nodes, using a [http://en.wikipedia.org/wiki/OpenPBS PBS]-like system [https://www.grid5000.fr/mediawiki/index.php/Cluster_experiment-OAR2 OAR]. Basic commands: ** '''oarstat''' - queue status ** '''oarsub''' - job submission ** '''oardel''' - job removal Interactive job on deployed images: <source lang="bash"> frontend_$ oarsub -I -t deploy -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> Batch job on installed images: <source lang="bash"> frontend_$ oarsub BATCH_FILE -t allow_classic_ssh -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> * The image to deploy can be created and loaded with the help of a [http://wiki.systemimager.org/index.php/Main_Page Systemimager]-like system [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2 Kadeploy]. Creating: [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2#Tune_an_environment_to_build_another_one:_customize_authentification_parameters described here] Loading: <source lang="bash"> frontend_$ kadeploy3 -e PATH_TO_IMAGE_DESC -f $OAR_FILE_NODES </source> A Linux distribution lenny-x64-nfs-2.1 with mc, subversion, autotools, doxygen, MPICH2, GSL, Boost, R, gnuplot, graphviz, X11, evince is available at Orsay /home/nancy/alastovetsky/grid5000.
* Running MPI applications is described [https://www.grid5000.fr/mediawiki/index.php/Run_MPI_On_Grid%275000 here] c143d25b687954ec428ad4fac6e1e8a1c7abefe7 421 420 2010-06-22T10:58:36Z Root 1 wikitext text/x-wiki https://www.grid5000.fr/mediawiki/index.php/Grid5000:Home == Usage == * Select sites and clusters for experiments, using information on the [https://www.grid5000.fr/mediawiki/index.php/Grid5000:Network#Grid.275000_Sites Grid5000 network] and the [https://www.grid5000.fr/mediawiki/index.php/Status Status page] * Access is provided via access nodes '''access.SITE.grid5000.fr''' marked [https://www.grid5000.fr/mediawiki/index.php/External_access here] as ''accessible from '''everywhere''' via ssh with '''keyboard-interactive''' authentication method''. As soon as you are on one of the sites, you can ssh directly to the frontend node of any other site: <source lang="bash"> access_$ ssh frontend.SITE2 </source> * There is no access to the Internet from computing nodes (external IPs should be registered on proxy), therefore, download/update your stuff at the access nodes. Several revision control clients are available. * Each site has a separate NFS, therefore, to run an application on several sites at once, you need to copy it with '''scp''', '''sftp''' or '''rsync''' between access or frontend nodes. * Jobs are run from the frontend nodes, using a [http://en.wikipedia.org/wiki/OpenPBS PBS]-like system [https://www.grid5000.fr/mediawiki/index.php/Cluster_experiment-OAR2 OAR].
Basic commands: ** '''oarstat''' - queue status ** '''oarsub''' - job submission ** '''oardel''' - job removal Interactive job on deployed images: <source lang="bash"> fontend_$ oarsub -I -t deploy -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> Batch job on installed images: <source lang="bash"> fontend_$ oarsub BATCH_FILE -t allow_classic_ssh -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> * The image to deploy can be created and loaded with help of a [http://wiki.systemimager.org/index.php/Main_Page Systemimager]-like system [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2 Kadeploy]. Creating: [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2#Tune_an_environment_to_build_another_one:_customize_authentification_parameters described here] Loading: <source lang="bash"> fontend_$ kadeploy3 -e PATH_TO_IMAGE_DESC -f $OAR_FILE_NODES </source> A Linux distribution lenny-x64-nfs-2.1 with mc, subversion, autotools, doxygen, MPICH2, GSL, Boost, R, gnuplot, graphviz, X11, evince is available at Orsay /home/nancy/alastovetsky/grid5000. * Running MPI applications is described [https://www.grid5000.fr/mediawiki/index.php/Run_MPI_On_Grid%275000 here] d98f8e1b5e11e3b9e560b9fe811830b804d575a4 422 421 2010-06-22T11:29:44Z Root 1 wikitext text/x-wiki https://www.grid5000.fr/mediawiki/index.php/Grid5000:Home == Usage == * Select sites and clusters for experiments, using information on the [https://www.grid5000.fr/mediawiki/index.php/Grid5000:Network#Grid.275000_Sites Grid5000 network] and the [https://www.grid5000.fr/mediawiki/index.php/Status Status page] * Access is provided via access nodes '''access.SITE.grid5000.fr''' marked [https://www.grid5000.fr/mediawiki/index.php/External_access here] as ''accessible from '''everywhere''' via ssh with '''keyboard-interactive''' authentication method''. 
As soon as you are on one of the sites, you can directly ssh frontend node of any other site: <source lang="bash"> access_$ ssh frontend.SITE2 </source> * There is no access to Internet from computing nodes (external IPs should be registered on proxy), therefore, download/update your stuff at the access nodes. Several revision control clients are available. * Each site has a separate NFS, therefore, to run an application on several sites at once, you need to copy it '''scp, sftp, rsync''' between access or frontend nodes. * Jobs are run from the frondend nodes, using a [http://en.wikipedia.org/wiki/OpenPBS PBS]-like system [https://www.grid5000.fr/mediawiki/index.php/Cluster_experiment-OAR2 OAR]. Basic commands: ** '''oarstat''' - queue status ** '''oarsub''' - job submission ** '''oardel''' - job removal Interactive job on deployed images: <source lang="bash"> fontend_$ oarsub -I -t deploy -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> Batch job on installed images: <source lang="bash"> fontend_$ oarsub BATCH_FILE -t allow_classic_ssh -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> * The image to deploy can be created and loaded with help of a [http://wiki.systemimager.org/index.php/Main_Page Systemimager]-like system [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2 Kadeploy]. Creating: [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2#Tune_an_environment_to_build_another_one:_customize_authentification_parameters described here] Loading: <source lang="bash"> fontend_$ kadeploy3 -a PATH_TO_PRIVATE_IMAGE_DESC -f $OAR_FILE_NODES </source> A Linux distribution lenny-x64-nfs-2.1 with mc, subversion, autotools, doxygen, MPICH2, GSL, Boost, R, gnuplot, graphviz, X11, evince is available at Orsay /home/nancy/alastovetsky/grid5000. 
* Running MPI applications is described [https://www.grid5000.fr/mediawiki/index.php/Run_MPI_On_Grid%275000 here] 89eb798a342fc34276d66531ee911a74c1573742 423 422 2010-06-22T15:13:23Z Kiril 3 /* Usage */ wikitext text/x-wiki https://www.grid5000.fr/mediawiki/index.php/Grid5000:Home == Usage == * Select sites and clusters for experiments, using information on the [https://www.grid5000.fr/mediawiki/index.php/Grid5000:Network#Grid.275000_Sites Grid5000 network] and the [https://www.grid5000.fr/mediawiki/index.php/Status Status page] * Access is provided via access nodes '''access.SITE.grid5000.fr''' marked [https://www.grid5000.fr/mediawiki/index.php/External_access here] as ''accessible from '''everywhere''' via ssh with '''keyboard-interactive''' authentication method''. As soon as you are on one of the sites, you can directly ssh frontend node of any other site: <source lang="bash"> access_$ ssh frontend.SITE2 </source> * There is no access to Internet from computing nodes (external IPs should be registered on proxy), therefore, download/update your stuff at the access nodes. Several revision control clients are available. * Each site has a separate NFS, therefore, to run an application on several sites at once, you need to copy it '''scp, sftp, rsync''' between access or frontend nodes. * Jobs are run from the frondend nodes, using a [http://en.wikipedia.org/wiki/OpenPBS PBS]-like system [https://www.grid5000.fr/mediawiki/index.php/Cluster_experiment-OAR2 OAR]. 
Basic commands: ** '''oarstat''' - queue status ** '''oarsub''' - job submission ** '''oardel''' - job removal Interactive job on deployed images: <source lang="bash"> fontend_$ oarsub -I -t deploy -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> Batch job on installed images: <source lang="bash"> fontend_$ oarsub BATCH_FILE -t allow_classic_ssh -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> * The image to deploy can be created and loaded with help of a [http://wiki.systemimager.org/index.php/Main_Page Systemimager]-like system [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2 Kadeploy]. Creating: [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2#Tune_an_environment_to_build_another_one:_customize_authentification_parameters described here] Loading: <source lang="bash"> fontend_$ kadeploy3 -a PATH_TO_PRIVATE_IMAGE_DESC -f $OAR_FILE_NODES </source> A Linux distribution lenny-x64-nfs-2.1 with mc, subversion, autotools, doxygen, MPICH2, GSL, Boost, R, gnuplot, graphviz, X11, evince is available at Orsay /home/nancy/alastovetsky/grid5000. * Running MPI applications is described [https://www.grid5000.fr/mediawiki/index.php/Run_MPI_On_Grid%275000 here] * Compilation should be done on one of the reserved nodes (e.g. ssh `head -n 1 $OAR_NODEFILE`) * MPI applications should be launched from one of the reserved nodes (e.g. 
ssh `head -n 1 $OAR_NODEFILE`) 7f76eaf3040ab48e70a9117c73b61da5b72066c3 424 423 2010-06-22T15:14:19Z Kiril 3 /* Usage */ wikitext text/x-wiki https://www.grid5000.fr/mediawiki/index.php/Grid5000:Home == Usage == * Select sites and clusters for experiments, using information on the [https://www.grid5000.fr/mediawiki/index.php/Grid5000:Network#Grid.275000_Sites Grid5000 network] and the [https://www.grid5000.fr/mediawiki/index.php/Status Status page] * Access is provided via access nodes '''access.SITE.grid5000.fr''' marked [https://www.grid5000.fr/mediawiki/index.php/External_access here] as ''accessible from '''everywhere''' via ssh with '''keyboard-interactive''' authentication method''. As soon as you are on one of the sites, you can directly ssh frontend node of any other site: <source lang="bash"> access_$ ssh frontend.SITE2 </source> * There is no access to Internet from computing nodes (external IPs should be registered on proxy), therefore, download/update your stuff at the access nodes. Several revision control clients are available. * Each site has a separate NFS, therefore, to run an application on several sites at once, you need to copy it '''scp, sftp, rsync''' between access or frontend nodes. * Jobs are run from the frondend nodes, using a [http://en.wikipedia.org/wiki/OpenPBS PBS]-like system [https://www.grid5000.fr/mediawiki/index.php/Cluster_experiment-OAR2 OAR]. 
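Once OAR has granted a reservation, the reserved nodes are listed in the file named by $OAR_NODEFILE, one line per reserved core, so hostnames repeat. A minimal sketch of extracting the first node (the one to ssh into for compiling and launching) and the distinct node count, using a hypothetical stand-in file:

```shell
#!/bin/sh
# Sketch: parse an OAR nodefile. The contents below are a hypothetical
# stand-in for the real $OAR_NODEFILE (one line per reserved core).
OAR_NODEFILE=$(mktemp)
printf 'hcl01\nhcl01\nhcl02\nhcl02\n' > "$OAR_NODEFILE"

FIRST_NODE=$(head -n 1 "$OAR_NODEFILE")          # node to ssh into
NODE_COUNT=$(uniq "$OAR_NODEFILE" | wc -l | tr -d ' ')  # distinct nodes
echo "first=$FIRST_NODE nodes=$NODE_COUNT"

rm -f "$OAR_NODEFILE"
```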
Basic commands: ** '''oarstat''' - queue status ** '''oarsub''' - job submission ** '''oardel''' - job removal Interactive job on deployed images: <source lang="bash"> fontend_$ oarsub -I -t deploy -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> Batch job on installed images: <source lang="bash"> fontend_$ oarsub BATCH_FILE -t allow_classic_ssh -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> * The image to deploy can be created and loaded with help of a [http://wiki.systemimager.org/index.php/Main_Page Systemimager]-like system [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2 Kadeploy]. Creating: [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2#Tune_an_environment_to_build_another_one:_customize_authentification_parameters described here] Loading: <source lang="bash"> fontend_$ kadeploy3 -a PATH_TO_PRIVATE_IMAGE_DESC -f $OAR_FILE_NODES </source> A Linux distribution lenny-x64-nfs-2.1 with mc, subversion, autotools, doxygen, MPICH2, GSL, Boost, R, gnuplot, graphviz, X11, evince is available at Orsay /home/nancy/alastovetsky/grid5000. * Compilation should be done on one of the reserved nodes (e.g. ssh `head -n 1 $OAR_NODEFILE`) * Running MPI applications is described [https://www.grid5000.fr/mediawiki/index.php/Run_MPI_On_Grid%275000 here] ** mpirun/mpiexec should be run from one of the reserved nodes (e.g. 
ssh `head -n 1 $OAR_NODEFILE`) 84f65e7088725652d77ad6a643df5d74df62785f 426 424 2010-06-22T16:24:26Z Kiril 3 /* Usage */ wikitext text/x-wiki https://www.grid5000.fr/mediawiki/index.php/Grid5000:Home == Login, job submission, deployment == * Select sites and clusters for experiments, using information on the [https://www.grid5000.fr/mediawiki/index.php/Grid5000:Network#Grid.275000_Sites Grid5000 network] and the [https://www.grid5000.fr/mediawiki/index.php/Status Status page] * Access is provided via access nodes '''access.SITE.grid5000.fr''' marked [https://www.grid5000.fr/mediawiki/index.php/External_access here] as ''accessible from '''everywhere''' via ssh with '''keyboard-interactive''' authentication method''. As soon as you are on one of the sites, you can directly ssh frontend node of any other site: <source lang="bash"> access_$ ssh frontend.SITE2 </source> * There is no access to Internet from computing nodes (external IPs should be registered on proxy), therefore, download/update your stuff at the access nodes. Several revision control clients are available. * Each site has a separate NFS, therefore, to run an application on several sites at once, you need to copy it '''scp, sftp, rsync''' between access or frontend nodes. * Jobs are run from the frondend nodes, using a [http://en.wikipedia.org/wiki/OpenPBS PBS]-like system [https://www.grid5000.fr/mediawiki/index.php/Cluster_experiment-OAR2 OAR]. 
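Since the nodefile has one line per reserved core, the total line count gives the process count for an MPI launch. A sketch of deriving it, with hypothetical file contents and a hypothetical executable name:

```shell
#!/bin/sh
# Sketch: derive the MPI process count from an OAR nodefile.
# File contents and ./my_app are hypothetical stand-ins.
OAR_NODEFILE=$(mktemp)
printf 'hcl01\nhcl01\nhcl02\n' > "$OAR_NODEFILE"

NPROCS=$(wc -l < "$OAR_NODEFILE" | tr -d ' ')    # one process per core line
LAUNCH="mpirun -n $NPROCS ./my_app"
echo "$LAUNCH"   # would be run on the first reserved node, not the frontend

rm -f "$OAR_NODEFILE"
```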
Basic commands: ** '''oarstat''' - queue status ** '''oarsub''' - job submission ** '''oardel''' - job removal Interactive job on deployed images: <source lang="bash"> fontend_$ oarsub -I -t deploy -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> Batch job on installed images: <source lang="bash"> fontend_$ oarsub BATCH_FILE -t allow_classic_ssh -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> * The image to deploy can be created and loaded with help of a [http://wiki.systemimager.org/index.php/Main_Page Systemimager]-like system [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2 Kadeploy]. Creating: [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2#Tune_an_environment_to_build_another_one:_customize_authentification_parameters described here] Loading: <source lang="bash"> fontend_$ kadeploy3 -a PATH_TO_PRIVATE_IMAGE_DESC -f $OAR_FILE_NODES </source> A Linux distribution lenny-x64-nfs-2.1 with mc, subversion, autotools, doxygen, MPICH2, GSL, Boost, R, gnuplot, graphviz, X11, evince is available at Orsay /home/nancy/alastovetsky/grid5000. == Compilation and running jobs == * Compilation should be done on one of the reserved nodes (e.g. ssh `head -n 1 $OAR_NODEFILE`) * Running MPI applications is described [https://www.grid5000.fr/mediawiki/index.php/Run_MPI_On_Grid%275000 here] ** mpirun/mpiexec should be run from one of the reserved nodes (e.g. 
ssh `head -n 1 $OAR_NODEFILE`) 88b7d4fbe45607525c3b51c3aa5e7a7f338fd8bc 427 426 2010-06-22T16:24:54Z Kiril 3 /* Login, job submission, deployment */ wikitext text/x-wiki https://www.grid5000.fr/mediawiki/index.php/Grid5000:Home == Login, job submission, deployment of image == * Select sites and clusters for experiments, using information on the [https://www.grid5000.fr/mediawiki/index.php/Grid5000:Network#Grid.275000_Sites Grid5000 network] and the [https://www.grid5000.fr/mediawiki/index.php/Status Status page] * Access is provided via access nodes '''access.SITE.grid5000.fr''' marked [https://www.grid5000.fr/mediawiki/index.php/External_access here] as ''accessible from '''everywhere''' via ssh with '''keyboard-interactive''' authentication method''. As soon as you are on one of the sites, you can directly ssh frontend node of any other site: <source lang="bash"> access_$ ssh frontend.SITE2 </source> * There is no access to Internet from computing nodes (external IPs should be registered on proxy), therefore, download/update your stuff at the access nodes. Several revision control clients are available. * Each site has a separate NFS, therefore, to run an application on several sites at once, you need to copy it '''scp, sftp, rsync''' between access or frontend nodes. * Jobs are run from the frondend nodes, using a [http://en.wikipedia.org/wiki/OpenPBS PBS]-like system [https://www.grid5000.fr/mediawiki/index.php/Cluster_experiment-OAR2 OAR]. 
Basic commands: ** '''oarstat''' - queue status ** '''oarsub''' - job submission ** '''oardel''' - job removal Interactive job on deployed images: <source lang="bash"> fontend_$ oarsub -I -t deploy -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> Batch job on installed images: <source lang="bash"> fontend_$ oarsub BATCH_FILE -t allow_classic_ssh -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> * The image to deploy can be created and loaded with help of a [http://wiki.systemimager.org/index.php/Main_Page Systemimager]-like system [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2 Kadeploy]. Creating: [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2#Tune_an_environment_to_build_another_one:_customize_authentification_parameters described here] Loading: <source lang="bash"> fontend_$ kadeploy3 -a PATH_TO_PRIVATE_IMAGE_DESC -f $OAR_FILE_NODES </source> A Linux distribution lenny-x64-nfs-2.1 with mc, subversion, autotools, doxygen, MPICH2, GSL, Boost, R, gnuplot, graphviz, X11, evince is available at Orsay /home/nancy/alastovetsky/grid5000. == Compilation and running jobs == * Compilation should be done on one of the reserved nodes (e.g. ssh `head -n 1 $OAR_NODEFILE`) * Running MPI applications is described [https://www.grid5000.fr/mediawiki/index.php/Run_MPI_On_Grid%275000 here] ** mpirun/mpiexec should be run from one of the reserved nodes (e.g. 
ssh `head -n 1 $OAR_NODEFILE`) 721dcbd102383f7d9e4b4939e80e8d1c71ae197d 428 427 2010-06-22T16:25:33Z Kiril 3 /* Compilation and running jobs */ wikitext text/x-wiki https://www.grid5000.fr/mediawiki/index.php/Grid5000:Home == Login, job submission, deployment of image == * Select sites and clusters for experiments, using information on the [https://www.grid5000.fr/mediawiki/index.php/Grid5000:Network#Grid.275000_Sites Grid5000 network] and the [https://www.grid5000.fr/mediawiki/index.php/Status Status page] * Access is provided via access nodes '''access.SITE.grid5000.fr''' marked [https://www.grid5000.fr/mediawiki/index.php/External_access here] as ''accessible from '''everywhere''' via ssh with '''keyboard-interactive''' authentication method''. As soon as you are on one of the sites, you can directly ssh frontend node of any other site: <source lang="bash"> access_$ ssh frontend.SITE2 </source> * There is no access to Internet from computing nodes (external IPs should be registered on proxy), therefore, download/update your stuff at the access nodes. Several revision control clients are available. * Each site has a separate NFS, therefore, to run an application on several sites at once, you need to copy it '''scp, sftp, rsync''' between access or frontend nodes. * Jobs are run from the frondend nodes, using a [http://en.wikipedia.org/wiki/OpenPBS PBS]-like system [https://www.grid5000.fr/mediawiki/index.php/Cluster_experiment-OAR2 OAR]. 
Basic commands: ** '''oarstat''' - queue status ** '''oarsub''' - job submission ** '''oardel''' - job removal Interactive job on deployed images: <source lang="bash"> frontend_$ oarsub -I -t deploy -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> Batch job on installed images: <source lang="bash"> frontend_$ oarsub BATCH_FILE -t allow_classic_ssh -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> * The image to deploy can be created and loaded with the help of a [http://wiki.systemimager.org/index.php/Main_Page Systemimager]-like system, [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2 Kadeploy]. Creating: [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2#Tune_an_environment_to_build_another_one:_customize_authentification_parameters described here] Loading: <source lang="bash"> frontend_$ kadeploy3 -a PATH_TO_PRIVATE_IMAGE_DESC -f $OAR_FILE_NODES </source> A Linux distribution lenny-x64-nfs-2.1 with mc, subversion, autotools, doxygen, MPICH2, GSL, Boost, R, gnuplot, graphviz, X11, evince is available at Orsay in /home/nancy/alastovetsky/grid5000. == Compiling and running MPI applications == * Compilation should be done on one of the reserved nodes (e.g. ssh `head -n 1 $OAR_NODEFILE`) * Running MPI applications is described [https://www.grid5000.fr/mediawiki/index.php/Run_MPI_On_Grid%275000 here] ** mpirun/mpiexec should be run from one of the reserved nodes (e.g.
ssh `head -n 1 $OAR_NODEFILE`) ae3135a5d57a10072ef3e9d5560707ddd6e3a56e MPICH2 0 44 425 254 2010-06-22T15:29:52Z Kiril 3 wikitext text/x-wiki Settings for MPICH2 daemon: <source lang="bash"> $ echo "MPD_SECRETWORD=XXX" > ~/.mpd.conf $ chmod 600 ~/.mpd.conf </source> Script for running application: <source lang="bash"> NODES=`uniq < $OAR_NODEFILE | wc -l | tr -d ' '` NPROCS=`wc -l < $OAR_NODEFILE | tr -d ' '` mpdboot --rsh=ssh --totalnum=$NODES --file=$OAR_NODEFILE sleep 1 mpirun -n $NPROCS path_to_executable </source> In MPICH2, you normally don't need to specify -machinefile explicitly for Grid5000 (OAR machinefiles are automatically read) b5c969bec96c3056a3edb6c0b412fca707cfec0c SSH 0 36 429 374 2010-06-25T22:40:07Z Kiril 3 /* Passwordless SSH */ wikitext text/x-wiki == Passwordless SSH == To set up passwordless SSH, there are three main things to do: * generate a pair of public/private keys on your local computer * copy the public key from the source computer to the target computer's authorized_keys file * check the permissions. You can repeat that transitively for "A->B->C". You can use the initial pair of keys everywhere. See here for details: http://www.stearns.org/doc/ssh-techniques.current.html == Automate the inclusion of hostname to "known_host" == == Making a cascade of SSH connections easy == Here is a very convenient way to set up the access to any machine directly instead of doing a cascade of SSH calls. If you can not directly access e.g. the machine "heterogeneous", but you can log into "csserver" and then to "heterogeneous", you can put this into your .ssh/config file : Host csserver User kdichev Hostname csserver.ucd.ie Host heterogeneous User kiril Hostname heterogeneous.ucd.ie ProxyCommand ssh -qax csserver nc %h %p Since the installation of a new PBS system, you can not directly log into a hclXX node. 
You can do ssh heterogeneous instead and use "qsub" == X11 forwarding == <code lang="bash"> ssh -X hostname </code> c0845e8b26845ac75815f8817e1bcc75d0b78747 430 429 2010-06-25T22:46:03Z Kiril 3 /* Automate the inclusion of hostname to "known_host" */ wikitext text/x-wiki == Passwordless SSH == To set up passwordless SSH, there are three main things to do: * generate a pair of public/private keys on your local computer * copy the public key from the source computer to the target computer's authorized_keys file * check the permissions. You can repeat that transitively for "A->B->C". You can use the initial pair of keys everywhere. See here for details: http://www.stearns.org/doc/ssh-techniques.current.html == Automatically saying "yes" == This expect script says "yes" when asked if #!/usr/bin/expect -f set arg1 [lindex $argv 0] spawn ssh $arg1 expect "yes" send "yes\r" send "exit\r" expect eof You can include it in a bash script to iterate over all nodes doing this: cat yes-everywhere.sh for i in `cat hostfile` ; do ./say-yes.exp $i done ~ ~ == Making a cascade of SSH connections easy == Here is a very convenient way to set up the access to any machine directly instead of doing a cascade of SSH calls. If you can not directly access e.g. the machine "heterogeneous", but you can log into "csserver" and then to "heterogeneous", you can put this into your .ssh/config file : Host csserver User kdichev Hostname csserver.ucd.ie Host heterogeneous User kiril Hostname heterogeneous.ucd.ie ProxyCommand ssh -qax csserver nc %h %p Since the installation of a new PBS system, you can not directly log into a hclXX node. 
You can do ssh heterogeneous instead and use "qsub" == X11 forwarding == <code lang="bash"> ssh -X hostname </code> 8fd87815966885aa68cb05adebf86365b2eb0cf4 431 430 2010-06-25T22:47:01Z Kiril 3 /* Automatically saying "yes" */ wikitext text/x-wiki == Passwordless SSH == To set up passwordless SSH, there are three main things to do: * generate a pair of public/private keys on your local computer * copy the public key from the source computer to the target computer's authorized_keys file * check the permissions. You can repeat that transitively for "A->B->C". You can use the initial pair of keys everywhere. See here for details: http://www.stearns.org/doc/ssh-techniques.current.html == Automatically saying "yes" == This expect script automates typing "yes" when asked by SSH if a host should be added to known_hosts #!/usr/bin/expect -f set arg1 [lindex $argv 0] spawn ssh $arg1 expect "yes" send "yes\r" send "exit\r" expect eof You can include it in a bash script to iterate over all nodes doing this: cat yes-everywhere.sh for i in `cat hostfile` ; do ./say-yes.exp $i done ~ ~ == Making a cascade of SSH connections easy == Here is a very convenient way to set up the access to any machine directly instead of doing a cascade of SSH calls. If you can not directly access e.g. the machine "heterogeneous", but you can log into "csserver" and then to "heterogeneous", you can put this into your .ssh/config file : Host csserver User kdichev Hostname csserver.ucd.ie Host heterogeneous User kiril Hostname heterogeneous.ucd.ie ProxyCommand ssh -qax csserver nc %h %p Since the installation of a new PBS system, you can not directly log into a hclXX node. 
You can do ssh heterogeneous instead and use "qsub" == X11 forwarding == <code lang="bash"> ssh -X hostname </code> 1a036e9d18a60904b7efeff428782ca528fd2b3c 432 431 2010-06-25T22:47:53Z Kiril 3 /* Automatically saying "yes" */ wikitext text/x-wiki == Passwordless SSH == To set up passwordless SSH, there are three main things to do: * generate a pair of public/private keys on your local computer * copy the public key from the source computer to the target computer's authorized_keys file * check the permissions. You can repeat that transitively for "A->B->C". You can use the initial pair of keys everywhere. See here for details: http://www.stearns.org/doc/ssh-techniques.current.html == Automatically saying "yes" == This expect script automates typing "yes" when asked by SSH if a host should be added to known_hosts #!/usr/bin/expect -f set arg1 [lindex $argv 0] spawn ssh $arg1 expect "yes" send "yes\r" send "exit\r" expect eof You can include it in a bash script to iterate over all nodes doing this: cat yes-everywhere.sh for i in `cat hostfile` ; do ./say-yes.exp $i done == Making a cascade of SSH connections easy == Here is a very convenient way to set up the access to any machine directly instead of doing a cascade of SSH calls. If you can not directly access e.g. the machine "heterogeneous", but you can log into "csserver" and then to "heterogeneous", you can put this into your .ssh/config file : Host csserver User kdichev Hostname csserver.ucd.ie Host heterogeneous User kiril Hostname heterogeneous.ucd.ie ProxyCommand ssh -qax csserver nc %h %p Since the installation of a new PBS system, you can not directly log into a hclXX node. 
You can do ssh heterogeneous instead and use "qsub" == X11 forwarding == <code lang="bash"> ssh -X hostname </code> 3e90e2b8bcc6608b7d8f4005147a9f86244c520c 433 432 2010-06-25T22:52:24Z Kiril 3 /* Automatically saying "yes" */ wikitext text/x-wiki == Passwordless SSH == To set up passwordless SSH, there are three main things to do: * generate a pair of public/private keys on your local computer * copy the public key from the source computer to the target computer's authorized_keys file * check the permissions. You can repeat that transitively for "A->B->C". You can use the initial pair of keys everywhere. See here for details: http://www.stearns.org/doc/ssh-techniques.current.html == Automatically saying "yes" (NOT WORKING PROPERLY == This expect script automates typing "yes" when asked by SSH if a host should be added to known_hosts #!/usr/bin/expect -f set arg1 [lindex $argv 0] spawn ssh $arg1 expect "yes" send "yes\r" send "exit\r" expect eof You can include it in a bash script to iterate over all nodes doing this: cat yes-everywhere.sh for i in `cat hostfile` ; do ./say-yes.exp $i done == Making a cascade of SSH connections easy == Here is a very convenient way to set up the access to any machine directly instead of doing a cascade of SSH calls. If you can not directly access e.g. the machine "heterogeneous", but you can log into "csserver" and then to "heterogeneous", you can put this into your .ssh/config file : Host csserver User kdichev Hostname csserver.ucd.ie Host heterogeneous User kiril Hostname heterogeneous.ucd.ie ProxyCommand ssh -qax csserver nc %h %p Since the installation of a new PBS system, you can not directly log into a hclXX node. 
== Passwordless SSH ==
To set up passwordless SSH, there are three main things to do:
* generate a pair of public/private keys on your local computer
* copy the public key from the source computer to the target computer's <code>authorized_keys</code> file
* check the permissions.
You can repeat this transitively for "A->B->C", reusing the initial key pair everywhere. See here for details: http://www.stearns.org/doc/ssh-techniques.current.html

== Automatically saying "yes" ==
This expect script automates typing "yes" when SSH asks whether a host should be added to known_hosts:
 #!/usr/bin/expect -f
 set arg1 [lindex $argv 0]
 set timeout 2
 spawn ssh $arg1
 expect "yes/no" { send "yes\n" }
 send "exit\n"
 send "\r"
You can call it from a bash script to iterate over all nodes:
 for i in `uniq hostfile` ; do ./say-yes.exp $i ; done

== Making a cascade of SSH connections easy ==
Here is a very convenient way to set up direct access to any machine instead of doing a cascade of SSH calls. If you cannot directly access e.g. the machine "heterogeneous", but you can log into "csserver" and from there into "heterogeneous", you can put this into your .ssh/config file:
 Host csserver
     User kdichev
     Hostname csserver.ucd.ie
 Host heterogeneous
     User kiril
     Hostname heterogeneous.ucd.ie
     ProxyCommand ssh -qax csserver nc %h %p
Since the installation of the new PBS system, you cannot log into an hclXX node directly. You can do <code>ssh heterogeneous</code> instead and use <code>qsub</code>. See [[Access and Security]].

== X11 forwarding ==
<code lang="bash"> ssh -X hostname </code>
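The three key-setup steps from the Passwordless SSH section above can be sketched as shell commands; <code>user@target</code> is a placeholder, not a real host:

```shell
#!/bin/sh
# Sketch of the three passwordless-SSH steps; "user@target" is a placeholder.

# 1. Generate a key pair on the local computer (empty passphrase for
#    fully passwordless operation).
ssh-keygen -t rsa -N "" -f "$HOME/.ssh/id_rsa"

# 2. Append the public key to the target's authorized_keys file
#    (ssh-copy-id user@target does the same where available).
cat "$HOME/.ssh/id_rsa.pub" | \
    ssh user@target 'mkdir -p ~/.ssh && cat >> ~/.ssh/authorized_keys'

# 3. Check the permissions: sshd ignores an authorized_keys file that is
#    group- or world-writable.
ssh user@target 'chmod 700 ~/.ssh && chmod 600 ~/.ssh/authorized_keys'
```

Note also that <code>ssh -o StrictHostKeyChecking=no</code> accepts new host keys without prompting, which makes the expect script above unnecessary in many cases.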
= Eclipse =
http://www.eclipse.org
* Use either Sun Java (Debian package: sun-java6-jre) or OpenJDK (Debian package: openjdk-6-jre). GNU Java (GCJ) may be too slow.
* If you experience problems with Sun Java, it may be IPv6; to resolve it, add a new line <code>-Djava.net.preferIPv4Stack=true</code> at the end of '''eclipse.ini''' (it must come after <code>-vmargs</code>).
* It is recommended to use the latest version from http://www.eclipse.org/downloads/ (Eclipse CDT is the distribution for C/C++ development).
* How to install plugins: http://agile.csc.ncsu.edu/SEMaterials/tutorials/install_plugin/ http://www.venukb.com/2006/08/20/install-eclipse-plugins-the-easy-way/

== Plugins ==
* [[Linuxtools]]
* [[Eclox]]
* [[Subversive]] or [[Subclipse]]
* [[TeXlipse]]

== Usage ==
* To avoid unresolved inclusions at edit time, add paths under Project -> Properties -> C/C++ General -> Paths and Symbols -> Includes.
* Comments of the form <code>// TODO: ...</code> mark what you are going to do later; these parts of the code can easily be found in the Tasks view (Window -> Show View -> Tasks).

= OpenMPI =
http://www.open-mpi.org/faq/

== MCA parameter files ==
If you want to permanently use some MCA parameter settings, you can create the file <code>$HOME/.openmpi/mca-params.conf</code>.
For example:
 $ cat $HOME/.openmpi/mca-params.conf
 btl_tcp_if_exclude = lo,eth1

= HCL cluster =
http://hcl.ucd.ie/Hardware

== General Information ==
[[Image:network.jpg|right|thumbnail||Layout of the Cluster]]
The HCL cluster is heterogeneous in both computing hardware and network capability. Nodes are from Dell, IBM, and HP, with Celeron, Pentium 4, Xeon, and AMD processors ranging in speed from 1.8 to 3.6 GHz. Accordingly, architectural parameters such as front-side bus, cache, and main memory all vary. The operating system used is Debian "squeeze" with Linux kernel 2.6.32. The network hardware consists of two Cisco 24+4 port Gigabit switches. Each node has two Gigabit Ethernet ports: each eth0 is connected to the first switch, and each eth1 to the second. The switches are also connected to each other. The bandwidth of each port can be configured to any value between 8 Kb/s and 1 Gb/s.
This allows testing on a very large number of network topologies. As the bandwidth on the link connecting the two switches can also be configured, the cluster can act as two separate clusters connected via one link. The diagram below shows a schematic of the cluster.

== Detailed Cluster Specification ==
A table of the hardware configuration is available here: [[Cluster Specification]]

== Cluster Administration ==
=== Useful Tools ===
<code>root</code> on <code>heterogeneous.ucd.ie</code> has a number of [http://expect.nist.gov/ Expect] scripts to automate administration of the cluster (in <code>/root/scripts</code>). <code>root_ssh</code> will automatically log into a host, provide the root password, and either return a shell to the user or execute a command passed as a second argument. Command syntax is as follows:
<source lang="text">
# root_ssh
usage: root_ssh [user@]<host> [command]
</source>
Example usage, to log in and execute a command on each node in the cluster (the file <code>/etc/dsh/machines.list</code> contains the hostnames of all compute nodes of the cluster):
 # for i in `cat /etc/dsh/machines.list`; do root_ssh $i ps ax \| grep pbs; done
The above is sequential. To run jobs in parallel, for example <code>apt-get update && apt-get -y upgrade</code>, try the following trick with [http://www.gnu.org/software/screen/ screen]:
 # for i in `cat /etc/dsh/machines.list`; do screen -L -d -m root_ssh $i 'apt-get update && apt-get -y upgrade'; done
You can check the screenlog.* files for errors and delete them when you are happy. Sometimes all logs are sent to screenlog.0; it is not clear why.
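The same fan-out can be sketched with plain shell background jobs, which gives one log file per host and avoids the screenlog.0 ambiguity. Here <code>run_on</code> is a self-contained stand-in for root_ssh/ssh, and the hostnames are illustrative:

```shell
#!/bin/sh
# Fan a command out to all nodes in parallel, one log file per host,
# then wait for every background job before inspecting the logs.
run_on() {                   # stand-in for: root_ssh <host> <command...>
    host=$1; shift
    echo "[$host] $*"        # real use would be: ssh "$host" "$@"
}

for host in hcl01 hcl02 hcl03; do   # real use: for host in `cat /etc/dsh/machines.list`
    run_on "$host" apt-get update > "log.$host" 2>&1 &
done
wait                                # block until all jobs complete
cat log.*                           # inspect the collected output
```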
== Software packages available on HCL Cluster 2.0 ==
With a fresh installation of the operating system on the HCL Cluster, the following packages are available:
* autoconf
* automake
* fftw2
* git
* gfortran
* gnuplot
* libtool
* netperf
* octave3.2
* qhull
* subversion
* valgrind
* gsl-dev
* vim
* python
* mc
* openmpi-bin
* openmpi-dev
* evince
* libboost-graph-dev
* libboost-serialization-dev
* r-cran-strucchange
* graphviz
* doxygen
* colorgcc

[[new hcl node install & configuration log]]
[[new heterogeneous.ucd.ie install log]]

== Access and Security ==
All access and security for the cluster is handled by the gateway machine (heterogeneous.ucd.ie). This machine is not considered a compute node and should not be used as such. The only new incoming connections allowed are SSH; other incoming packets, such as HTTP packets responding to requests from inside the cluster (established or related), are also allowed. Incoming SSH packets are only accepted if they originate from designated IP addresses. These must be registered UCD IPs: csserver.ucd.ie is allowed, as is hclgate.ucd.ie, on which all users have accounts. Thus, to gain access to the cluster, you can ssh from csserver, hclgate or another allowed machine to heterogeneous. From there you can ssh to any of the nodes (hcl01-hcl16).
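The ProxyCommand recipe from the [[SSH]] page can be extended one hop, so that an inner node is reachable in a single command. A sketch for ~/.ssh/config, with usernames omitted and assuming <code>nc</code> is installed on the intermediate hosts:

```
Host heterogeneous
    Hostname heterogeneous.ucd.ie
    ProxyCommand ssh -qax csserver.ucd.ie nc %h %p
Host hcl01
    Hostname hcl01
    ProxyCommand ssh -qax heterogeneous nc %h %p
```

With this in place, <code>ssh hcl01</code> tunnels automatically through csserver and heterogeneous.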
Access from outside the UCD network is only possible once you have gained entry to a server that allows outside connections (such as csserver.ucd.ie).

== Some networking issues on HCL cluster (unsolved) ==
"/sbin/route" should give:
 Kernel IP routing table
 Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
 239.2.11.72     *               255.255.255.255 UH    0      0        0 eth0
 heterogeneous.u *               255.255.255.255 UH    0      0        0 eth0
 192.168.21.0    *               255.255.255.0   U     0      0        0 eth1
 192.168.20.0    *               255.255.255.0   U     0      0        0 eth0
 192.168.20.0    *               255.255.254.0   U     0      0        0 eth0
 192.168.20.0    *               255.255.254.0   U     0      0        0 eth1
 default         heterogeneous.u 0.0.0.0         UG    0      0        0 eth0
For reasons unclear, many machines sometimes miss the entry:
 192.168.21.0    *               255.255.255.0   U     0      0        0 eth1
For Open MPI, this leads to an inability to "connect" a socket to any 192.*.21.* address (the call hangs). In this case, you can either:
* switch off eth1 (see also [http://hcl.ucd.ie/wiki/index.php/OpenMPI]):
 mpirun --mca btl_tcp_if_exclude lo,eth1 ...
or
* restore the above table on all nodes by running "sh /etc/network/if-up.d/00routes" as root.
It is not yet clear why, without this entry, connections to the "21" addresses fail. We expect that in this case the following rule should be matched (because of the mask):
 192.168.20.0    *               255.255.254.0   U     0      0        0 eth0
The packets then leave over the eth0 network interface and should travel over switch1 to switch2 and on to the eth1 interface of the corresponding node.
* If one attempts a ping from one node A, via its eth0 interface, to the address of another node's (B) eth1 interface, the following is observed:
** outgoing ping packets appear only on the eth0 interface of node A;
** incoming ping packets appear only on the eth1 interface of node B;
** outgoing ping response packets appear on the eth0 interface of node B, never on the eth1 interface, despite pinging the eth1 address specifically. What explains this?
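A node can be checked for the missing 192.168.21.0 entry with a small script. This is a sketch assuming the usual <code>route -n</code> output format; the sample line fed to it here is taken from the table above:

```shell
#!/bin/sh
# Report whether the 192.168.21.0/24 -> eth1 route described above is
# present in a routing table given as text (the output of `route -n`).
check_route() {
    if echo "$1" | grep -q '^192\.168\.21\.0 .*eth1'; then
        echo "route present"
    else
        echo "route missing"   # fix: sh /etc/network/if-up.d/00routes (as root)
    fi
}

# Demonstration on a sample line; real use: check_route "$(/sbin/route -n)"
check_route "192.168.21.0    0.0.0.0         255.255.255.0   U     0      0        0 eth1"
# -> route present
```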
With the routing tables as they are above, or in the damaged case, the ping may arrive at the correct interface, but the response from B is routed to A-eth0 via B-eth0. Further, after a number of ping packets have been sent in sequence (50 to 100), pings from A begin to appear on both A-eth0 and A-eth1, even though the -I eth0 switch is specified. This behaviour is unexpected, but it does not affect the return path of the ping response packet. To get symmetric behaviour, where a packet leaves A-eth0, travels via the switch bridge to B-eth1 and returns from B-eth1 to A-eth0, one must ensure that the routing table of B contains no eth0 entries.

== Paging and the OOM-Killer ==
Due to the nature of experiments run on the cluster, we often induce heavy paging and complete exhaustion of available memory on certain nodes. Linux has a pair of strategies to deal with heavy memory use. The first is overcommitting: a process is allowed to allocate or fork even when no more memory is available. You can see some interesting numbers here: [http://opsmonkey.blogspot.com/2007/01/linux-memory-overcommit.html]. The assumption is that processes may not use all the memory they allocate, and that failing on allocation is worse than failing later, when the memory is actually used. More processes can be supported by allowing them to allocate memory, provided they do not use it all. The second part of the strategy is the Out-of-Memory killer (OOM killer). When memory has been exhausted and a process tries to use some 'overcommitted' part of memory, the OOM killer is invoked. Its job is to rank all processes in terms of their memory use, priority, privilege and some other parameters, and then to select a process to kill based on the ranks.
The argument for overcommit plus the OOM killer is that, rather than failing to allocate memory for some random unlucky process, which would probably terminate as a result, the kernel can let the unlucky process continue executing and later make a somewhat informed decision about which process to kill. Unfortunately, the behaviour of the OOM killer sometimes grinds the machine to a complete halt, particularly when it decides to kill system processes. There is a good discussion of the OOM killer here: [http://lwn.net/Articles/104179/] If you notice that a machine on the cluster has gone down (particularly hcl11), it is likely to be a result of this. For this reason, overcommit has been disabled on the cluster:

 cat /proc/sys/vm/overcommit_memory
 2
 cat /proc/sys/vm/overcommit_ratio
 100

To restore the default overcommit behaviour:

 # echo 0 > /proc/sys/vm/overcommit_memory
 # echo 50 > /proc/sys/vm/overcommit_ratio

http://hcl.ucd.ie/Hardware

== General Information ==

[[Image:network.jpg|right|thumbnail|Layout of the Cluster]]

The HCL cluster is heterogeneous in both computing hardware and network capability. Nodes are from Dell, IBM and HP, with Celeron, Pentium 4, Xeon and AMD processors ranging in speed from 1.8 to 3.6 GHz. Accordingly, architectures and parameters such as front-side bus, cache and main memory all vary. The operating system is Debian "squeeze" with Linux kernel 2.6.32. The network hardware consists of two Cisco 24+4 port Gigabit switches. Each node has two Gigabit Ethernet ports: each eth0 is connected to the first switch and each eth1 to the second. The switches are also connected to each other. The bandwidth of each port can be configured to any value between 8 Kb/s and 1 Gb/s.
This allows testing on a very large number of network topologies. As the bandwidth of the link connecting the two switches can also be configured, the cluster can act as two separate clusters connected by a single link. The diagram below shows a schematic of the cluster.

== Detailed Cluster Specification ==

A table of the hardware configuration is available here: [[Cluster Specification]]

== Cluster Administration ==

===Useful Tools===

<code>root</code> on <code>heterogeneous.ucd.ie</code> has a number of [http://expect.nist.gov/ Expect] scripts (in <code>/root/scripts</code>) that automate administration of the cluster. <code>root_ssh</code> automatically logs into a host, supplies the root password, and either returns a shell to the user or executes a command passed as a second argument. The syntax is as follows:

<source lang="text">
# root_ssh
usage: root_ssh [user@]<host> [command]
</source>

Example usage, to log in and execute a command on each node in the cluster (the file <code>/etc/dsh/machines.list</code> contains the hostnames of all compute nodes of the cluster):

 # for i in `cat /etc/dsh/machines.list`; do root_ssh $i ps ax \| grep pbs; done

The above runs sequentially. To run jobs in parallel, for example <code>apt-get update && apt-get -y upgrade</code>, try the following trick with [http://www.gnu.org/software/screen/ screen]:

 # for i in `cat /etc/dsh/machines.list`; do screen -L -d -m root_ssh $i 'apt-get update && apt-get -y upgrade'; done

Check the screenlog.* files for errors and delete them when you are happy. Sometimes all logs are sent to screenlog.0; it is not clear why.
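The sequential loop above can be wrapped in a small function for reuse. This is a sketch, not one of the existing /root/scripts helpers; it takes the machines file and a command, runs the command once per host, and labels each host's output:

```shell
#!/bin/sh
# Sketch: run CMD HOST [ARGS...] once per host listed (one hostname per
# line) in a machines file, sequentially, labelling each host's output.
run_on_all() {
    list="$1"; cmd="$2"; shift 2
    while IFS= read -r host; do
        [ -n "$host" ] || continue      # skip blank lines
        echo "== $host =="
        "$cmd" "$host" "$@"             # e.g. root_ssh hcl01 uptime
    done < "$list"
}

# Example: run_on_all /etc/dsh/machines.list root_ssh uptime
```

The per-host label makes it easier to attribute errors than the interleaved screenlog.* files produced by the parallel variant.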
== Software packages available on HCL Cluster 2.0 ==

With a fresh installation of the operating system on the HCL Cluster, the following packages are available:

* autoconf
* automake
* fftw2
* git
* gfortran
* gnuplot
* libtool
* netperf
* octave3.2
* qhull
* subversion
* valgrind
* gsl-dev
* vim
* python
* mc
* openmpi-bin
* openmpi-dev
* evince
* libboost-graph-dev
* libboost-serialization-dev
* r-cran-strucchange
* graphviz
* doxygen
* colorgcc

[[new hcl node install & configuration log]]

[[new heterogeneous.ucd.ie install log]]

===APT===

To perform unattended updates on cluster machines, you need to set an environment variable and pass some switches to apt-get:

 export DEBIAN_FRONTEND=noninteractive
 apt-get -q -y upgrade

== Access and Security ==

All access and security for the cluster is handled by the gateway machine (heterogeneous.ucd.ie). This machine is not considered a compute node and should not be used as one. The only new incoming connections allowed are ssh; other incoming packets, such as http packets responding to requests from inside the cluster (established or related), are also allowed. Incoming ssh packets are accepted only if they originate from designated IP addresses, which must be registered UCD IPs. csserver.ucd.ie is allowed, as is hclgate.ucd.ie, on which all users have accounts. Thus, to gain access to the cluster, ssh from csserver, hclgate or another allowed machine to heterogeneous; from there you can ssh to any of the nodes (hcl01-hcl16).
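The compute nodes follow a fixed naming scheme (hcl01-hcl16), so scripts that take a node name as an argument can validate it before attempting the two-hop ssh path. The validator below is hypothetical, not an existing cluster script:

```shell
#!/bin/sh
# Sketch: succeed only for valid compute-node names hcl01..hcl16.
is_hcl_node() {
    case "$1" in
        hcl0[1-9]|hcl1[0-6]) return 0 ;;
        *) return 1 ;;
    esac
}

# Example guard before connecting from heterogeneous:
# is_hcl_node "$1" && ssh "$1"
```

A check like this fails fast on typos (e.g. "hcl17") instead of letting ssh time out inside the cluster.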
The diagram below shows a schematic of the cluster. == Detailed Cluster Specification == A table of hardware configuration is available here: [[Cluster Specification]] == Cluster Administration == ===Useful Tools=== <code>root</code> on <code>heterogeneous.ucd.ie</code> has a number of [http://expect.nist.gov/ Expect] scripts to automate administration on the cluster (in <code>/root/scripts</code>). <code>root_ssh</code> will automatically log into a host, provide the root password and either return a shell to the user or execute a command that is passed as a second argument. Command syntax is as follows: <source lang="text"> # root_ssh usage: root_ssh [user@]<host> [command] </source> Example usage, to login and execute a command on each node in the cluster (note the file <code>/etc/dsh/machines.list</code> contains the hostnames of all compute nodes of the cluster): # for i in `cat /etc/dsh/machines.list`; do root_ssh $i ps ax \| grep pbs; done The above is sequential. To run parallel jobs, for example: <code>apt-get update && apt-get -y upgrade</code>, try the following trick with [http://www.gnu.org/software/screen/ screen]: # for i in `cat /etc/dsh/machines.list`; do screen -L -d -m root_ssh $i apt-get update \&\& apt-get -y upgrade'; done You can check the screenlog.* files for errors and delete them when you are happy. Sometimes all logs are sent to screenlog.0, not sure why. 
== Software packages available on HCL Cluster 2.0 == Wit a fresh installation of operating systems on HCL Cluster the follow list of packages are avalible: * autoconf * automake * fftw2 * git * gfortran * gnuplot * libtool * netperf * octave3.2 * qhull * subversion * valgrind * gsl-dev * vim * python * mc * openmpi-bin * openmpi-dev * evince * libboost-graph-dev * libboost-serialization-dev * r-cran-strucchange * graphviz * doxygen * colorgcc [[new hcl node install & configuration log]] [[new heterogeneous.ucd.ie install log]] ===APT=== To do unattended updates on cluster machines you need to specify some environment variables and switches to apt-get: export DEBIAN_FRONTEND=noninteractive apt-get -q -y upgrade NOTE: on hcl01 and hcl02 any updates to grub will force a prompt, despite the switches above. This happens because there are two disks on these machines and grub asks which it should install itself on. == Access and Security == All access and security for the cluster is handled by the gateway machine (heterogeneous.ucd.ie). This machine is not considered a compute node and should not be used as such. The only new incoming connections allowed are ssh, other incoming packets such as http that are responding to requests from inside the cluster (established or related) are also allowed. Incoming ssh packets are only accepted if they are originating from designated IP addresses. These IP's must be registered ucd IP's. csserver.ucd.ie is allowed, as is hclgate.ucd.ie, on which all users have accounts. Thus to gain access to the cluster you can ssh from csserver, hclgate or other allowed machines to heterogeneous. From there you can ssh to any of the nodes (hcl01-hcl16). 
Access from outside the UCD network is only allowed once you have gained entry to a server that allows outside connections (such as csserver.ucd.ie) == Some networking issues on HCL cluster (unsolved) == "/sbin/route" should give: Kernel IP routing table Destination Gateway Genmask Flags Metric Ref Use Iface 239.2.11.72 * 255.255.255.255 UH 0 0 0 eth0 heterogeneous.u * 255.255.255.255 UH 0 0 0 eth0 192.168.21.0 * 255.255.255.0 U 0 0 0 eth1 192.168.20.0 * 255.255.255.0 U 0 0 0 eth0 192.168.20.0 * 255.255.254.0 U 0 0 0 eth0 192.168.20.0 * 255.255.254.0 U 0 0 0 eth1 default heterogeneous.u 0.0.0.0 UG 0 0 0 eth0 For reasons unclear, sometimes many machines miss the entry: 192.168.21.0 * 255.255.255.0 U 0 0 0 eth1 For Open MPI, this leads to inability to do a system sockets "connect" call to any 192.*.21.* address (hangup). For this case, you can * switch off eth1 (see also [http://hcl.ucd.ie/wiki/index.php/OpenMPI] ): mpirun --mca btl_tcp_if_exclude lo,eth1 ... or * you can restore the above table on all nodes by running "sh /etc/network/if-up.d/00routes" as root It is not yet clear why without this entry the connection to the "21" addresses can't be connected. We expect that in this case following rule should be matched (because of the mask): 192.168.20.0 * 255.255.254.0 U 0 0 0 eth0 The packets leave over the eth0 network interface then and should go over switch1 to switch2 and eth1 interface of the corresponding node * If one attempts a ping from one node A, via its eth0 interface, to the address of another node's (B) eth1 interface, the following is observed: ** outgoing ping packets appear only on the eth0 interface of the first node A. ** incoming ping packets appear only on eth1 interface of the second node B. ** outgoing ping response packets appear on the eth0 interface of the second node B, never on the eth1 interface despite pinging the eth1 address specifically. What explains this? 
With the routing tables as they are above, or in the damaged case, the ping may arrive to the correct interface, but the response from B is routed to A-eth0 via B-eth0. Further, after a number of ping packets have been sent in sequence (50 to 100), pings from A, though the -i eth0 switch is specified, begin to appear on both A-eth0 and A-eth1. This behaviour is unexpected, but does not effect the return path of the ping response packet. In order to get a symmetric behaviour, where a packet leaves A-eth0, travels via the switch bridge to B-eth1 and returns back from B-eth1 to A-eth0, one must ensure the routing table of B contains no eth0 entries. == Paging and the OOM-Killer == Due to the nature of experiments run on the cluster, we often induce heavy paging and complete exhaustion of available memory on certain nodes. Linux has a pair of strategies to deal with heavy memory use. First, is overcommitting. This is where a process is allowed allocate or fork even when there is no more memory available. You can seem some interesting numbers here:[http://opsmonkey.blogspot.com/2007/01/linux-memory-overcommit.html]. The assumption is that processes may not use all memory that they allocate and failing on allocation is worse than failing at a later date when the memory use is actually required. More processes may be supported by allowing them to allocate memory (provided they do not use it all). The second part of the strategy is the Out-of-Memory killer (OOM Killer). When memory has been exhausted and a process tries to use some 'overcommitted' part of memory, the OOM killer is invoked. It's job is to rank all processes in terms of their memory use, priority, privilege and some other parameters, and then select a process to kill based on the ranks. 
The argument for using overcommital+OOM Killer is that rather than failing to allocate memory for some random unlucky process, which as a result would probably terminate, the kernel can instead allow the unlucky process to continue executing and then make a some-what-informed decision on which process to kill. Unfortunately, the behaviour of the OOM-killer sometimes causes problems which grind the machine to a complete halt, particularly when it decides to kill system processes. There is a good discussion on the OOM-killer here: [http://lwn.net/Articles/104179/] For this reason overcommit has been disabled on the cluster. cat /proc/sys/vm/overcommit_memory 2 cat /proc/sys/vm/overcommit_ratio 100 To restore to default overcommit # echo 0 > /proc/sys/vm/overcommit_memory # echo 50 > /proc/sys/vm/overcommit_ratio 74d4effaba666693ed2b9db069d1566989dc1ec4 456 452 2010-08-18T08:38:13Z Root 1 /* General Information */ wikitext text/x-wiki http://hcl.ucd.ie/Hardware ==General Information== [[Image:network.jpg|right|thumbnail||Layout of the Cluster]] The hcl cluster is heterogeneous in computing hardware & network ability. Nodes are from Dell, IBM, and HP, with Celeron, Pentium 4, Xeon, and AMD processors ranging in speeds from 1.8 to 3.6Ghz. Accordingly architectures and parameters such as Front Side Bus, Cache, and Main Memory all vary. Operating System used is Debian “squeeze” with Linux kernel 2.6.32. The network hardware consists of two Cisco 24+4 port Gigabit switches. Each node has two Gigabit ethernet ports - each eth0 is connected to the first switch, and each eth1 is connected to the second switch. The switches are also connected to each other. The bandwidth of each port can be configured to meet any value between 8Kb/s and 1Gb/s. This allows testing on a very large number of network topologies, As the bandwidth on the link connecting the two switches can also be configured, the cluster can actually act as two separate clusters connected via one link. 
The diagram below shows a schematic of the cluster. * [[media:Cisco3560Specs.pdf|Cisco Catalyst 3560 Specifications]] * [[media:Cisco3560Guide.pdf|Cisco Catalyst 3560 User Guide]] == Detailed Cluster Specification == A table of hardware configuration is available here: [[Cluster Specification]] == Cluster Administration == ===Useful Tools=== <code>root</code> on <code>heterogeneous.ucd.ie</code> has a number of [http://expect.nist.gov/ Expect] scripts to automate administration on the cluster (in <code>/root/scripts</code>). <code>root_ssh</code> will automatically log into a host, provide the root password and either return a shell to the user or execute a command that is passed as a second argument. Command syntax is as follows: <source lang="text"> # root_ssh usage: root_ssh [user@]<host> [command] </source> Example usage, to login and execute a command on each node in the cluster (note the file <code>/etc/dsh/machines.list</code> contains the hostnames of all compute nodes of the cluster): # for i in `cat /etc/dsh/machines.list`; do root_ssh $i ps ax \| grep pbs; done The above is sequential. To run parallel jobs, for example: <code>apt-get update && apt-get -y upgrade</code>, try the following trick with [http://www.gnu.org/software/screen/ screen]: # for i in `cat /etc/dsh/machines.list`; do screen -L -d -m root_ssh $i apt-get update \&\& apt-get -y upgrade'; done You can check the screenlog.* files for errors and delete them when you are happy. Sometimes all logs are sent to screenlog.0, not sure why. 
== Software packages available on HCL Cluster 2.0 == Wit a fresh installation of operating systems on HCL Cluster the follow list of packages are avalible: * autoconf * automake * fftw2 * git * gfortran * gnuplot * libtool * netperf * octave3.2 * qhull * subversion * valgrind * gsl-dev * vim * python * mc * openmpi-bin * openmpi-dev * evince * libboost-graph-dev * libboost-serialization-dev * r-cran-strucchange * graphviz * doxygen * colorgcc [[new hcl node install & configuration log]] [[new heterogeneous.ucd.ie install log]] ===APT=== To do unattended updates on cluster machines you need to specify some environment variables and switches to apt-get: export DEBIAN_FRONTEND=noninteractive apt-get -q -y upgrade NOTE: on hcl01 and hcl02 any updates to grub will force a prompt, despite the switches above. This happens because there are two disks on these machines and grub asks which it should install itself on. == Access and Security == All access and security for the cluster is handled by the gateway machine (heterogeneous.ucd.ie). This machine is not considered a compute node and should not be used as such. The only new incoming connections allowed are ssh, other incoming packets such as http that are responding to requests from inside the cluster (established or related) are also allowed. Incoming ssh packets are only accepted if they are originating from designated IP addresses. These IP's must be registered ucd IP's. csserver.ucd.ie is allowed, as is hclgate.ucd.ie, on which all users have accounts. Thus to gain access to the cluster you can ssh from csserver, hclgate or other allowed machines to heterogeneous. From there you can ssh to any of the nodes (hcl01-hcl16). 
Access from outside the UCD network is only allowed once you have gained entry to a server that allows outside connections (such as csserver.ucd.ie) == Some networking issues on HCL cluster (unsolved) == "/sbin/route" should give: Kernel IP routing table Destination Gateway Genmask Flags Metric Ref Use Iface 239.2.11.72 * 255.255.255.255 UH 0 0 0 eth0 heterogeneous.u * 255.255.255.255 UH 0 0 0 eth0 192.168.21.0 * 255.255.255.0 U 0 0 0 eth1 192.168.20.0 * 255.255.255.0 U 0 0 0 eth0 192.168.20.0 * 255.255.254.0 U 0 0 0 eth0 192.168.20.0 * 255.255.254.0 U 0 0 0 eth1 default heterogeneous.u 0.0.0.0 UG 0 0 0 eth0 For reasons unclear, sometimes many machines miss the entry: 192.168.21.0 * 255.255.255.0 U 0 0 0 eth1 For Open MPI, this leads to inability to do a system sockets "connect" call to any 192.*.21.* address (hangup). For this case, you can * switch off eth1 (see also [http://hcl.ucd.ie/wiki/index.php/OpenMPI] ): mpirun --mca btl_tcp_if_exclude lo,eth1 ... or * you can restore the above table on all nodes by running "sh /etc/network/if-up.d/00routes" as root It is not yet clear why without this entry the connection to the "21" addresses can't be connected. We expect that in this case following rule should be matched (because of the mask): 192.168.20.0 * 255.255.254.0 U 0 0 0 eth0 The packets leave over the eth0 network interface then and should go over switch1 to switch2 and eth1 interface of the corresponding node * If one attempts a ping from one node A, via its eth0 interface, to the address of another node's (B) eth1 interface, the following is observed: ** outgoing ping packets appear only on the eth0 interface of the first node A. ** incoming ping packets appear only on eth1 interface of the second node B. ** outgoing ping response packets appear on the eth0 interface of the second node B, never on the eth1 interface despite pinging the eth1 address specifically. What explains this? 
With the routing tables as they are above, or in the damaged case, the ping may arrive to the correct interface, but the response from B is routed to A-eth0 via B-eth0. Further, after a number of ping packets have been sent in sequence (50 to 100), pings from A, though the -i eth0 switch is specified, begin to appear on both A-eth0 and A-eth1. This behaviour is unexpected, but does not effect the return path of the ping response packet. In order to get a symmetric behaviour, where a packet leaves A-eth0, travels via the switch bridge to B-eth1 and returns back from B-eth1 to A-eth0, one must ensure the routing table of B contains no eth0 entries. == Paging and the OOM-Killer == Due to the nature of experiments run on the cluster, we often induce heavy paging and complete exhaustion of available memory on certain nodes. Linux has a pair of strategies to deal with heavy memory use. First, is overcommitting. This is where a process is allowed allocate or fork even when there is no more memory available. You can seem some interesting numbers here:[http://opsmonkey.blogspot.com/2007/01/linux-memory-overcommit.html]. The assumption is that processes may not use all memory that they allocate and failing on allocation is worse than failing at a later date when the memory use is actually required. More processes may be supported by allowing them to allocate memory (provided they do not use it all). The second part of the strategy is the Out-of-Memory killer (OOM Killer). When memory has been exhausted and a process tries to use some 'overcommitted' part of memory, the OOM killer is invoked. It's job is to rank all processes in terms of their memory use, priority, privilege and some other parameters, and then select a process to kill based on the ranks. 
http://hcl.ucd.ie/Hardware

==General Information==
[[Image:network.jpg|right|thumbnail||Layout of the Cluster]]
The HCL cluster is heterogeneous in both computing hardware and network capability. Nodes are from Dell, IBM, and HP, with Celeron, Pentium 4, Xeon, and AMD processors ranging in speed from 1.8 to 3.6 GHz. Accordingly, parameters such as architecture, front-side bus, cache, and main memory all vary. The operating system is Debian "squeeze" with Linux kernel 2.6.32.

The network hardware consists of two Cisco 24+4-port Gigabit switches. Each node has two Gigabit Ethernet ports: eth0 is connected to the first switch and eth1 to the second. The switches are also connected to each other. The bandwidth of each port can be configured to any value between 8 Kb/s and 1 Gb/s, which allows testing on a very large number of network topologies. As the bandwidth of the link connecting the two switches can also be configured, the cluster can act as two separate clusters connected via a single link.
The diagram shows a schematic of the cluster.

* [[media:Cisco3560Specs.pdf|Cisco Catalyst 3560 Specifications]]
* [[media:Cisco3560Guide.pdf|Cisco Catalyst 3560 User Guide]]
* [[media:PE750.tgz|Dell PowerEdge 750 Documentation]]
* [[media:SC1425.tgz|Dell PowerEdge SC1425 Documentation]]
* [[media:X306.pdf|IBM xSeries 306 Documentation]]
* [[media:E326.pdf|IBM eSeries 326 Documentation]]
* [[media:Proliant100SeriesGuide.pdf|HP ProLiant DL-140 G2 Documentation]]
* [[media:ProliantDL320G3Guide.pdf|HP ProLiant DL-320 G3 Documentation]]

== Detailed Cluster Specification ==
A table of the hardware configuration is available here: [[Cluster Specification]]

== Cluster Administration ==
===Useful Tools===
<code>root</code> on <code>heterogeneous.ucd.ie</code> has a number of [http://expect.nist.gov/ Expect] scripts (in <code>/root/scripts</code>) to automate administration of the cluster. <code>root_ssh</code> automatically logs into a host, provides the root password, and either returns a shell to the user or executes a command passed as a second argument. The syntax is as follows:
<source lang="text">
# root_ssh
usage: root_ssh [user@]<host> [command]
</source>
Example usage, to log in and execute a command on each node in the cluster (the file <code>/etc/dsh/machines.list</code> contains the hostnames of all compute nodes):
 # for i in `cat /etc/dsh/machines.list`; do root_ssh $i ps ax \| grep pbs; done
The above runs sequentially. To run jobs in parallel, for example <code>apt-get update && apt-get -y upgrade</code>, try the following trick with [http://www.gnu.org/software/screen/ screen]:
 # for i in `cat /etc/dsh/machines.list`; do screen -L -d -m root_ssh $i 'apt-get update && apt-get -y upgrade'; done
Check the screenlog.* files for errors and delete them when you are satisfied. Occasionally all logs are written to screenlog.0; the cause is unknown.
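The screen trick above can be generalized into a small helper that gives each host its own log file instead of a shared screenlog. This is a sketch in plain POSIX sh; <code>run_all</code> and the <code>runner</code> argument are hypothetical names, and on the cluster you would pass the local <code>root_ssh</code> wrapper as the runner:

```shell
#!/bin/sh
# run_all <machines-file> <runner> [command...]
# Runs "<runner> <host> <command...>" for every host listed in the
# machines file, in parallel, writing one log file per host.
run_all() {
    machines=$1; runner=$2; shift 2
    while IFS= read -r host; do
        # Background each job so all hosts run concurrently;
        # per-host logs avoid the shared screenlog.0 problem.
        "$runner" "$host" "$@" > "log.$host" 2>&1 &
    done < "$machines"
    wait    # block until every background job finishes
}
```

On the gateway this might be invoked as <code>run_all /etc/dsh/machines.list root_ssh apt-get -q -y upgrade</code>, after which the per-host <code>log.*</code> files can be inspected and deleted.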
== Software packages available on HCL Cluster 2.0 ==
With a fresh installation of the operating system on the HCL cluster, the following packages are available:
* autoconf
* automake
* fftw2
* git
* gfortran
* gnuplot
* libtool
* netperf
* octave3.2
* qhull
* subversion
* valgrind
* gsl-dev
* vim
* python
* mc
* openmpi-bin
* openmpi-dev
* evince
* libboost-graph-dev
* libboost-serialization-dev
* r-cran-strucchange
* graphviz
* doxygen
* colorgcc

[[new hcl node install & configuration log]]

[[new heterogeneous.ucd.ie install log]]

===APT===
To run unattended updates on the cluster machines, you need to set an environment variable and pass some switches to apt-get:
 export DEBIAN_FRONTEND=noninteractive
 apt-get -q -y upgrade
NOTE: on hcl01 and hcl02 any update to grub will force a prompt, despite the switches above. This happens because these machines have two disks, and grub asks which one it should install itself on.

== Access and Security ==
All access and security for the cluster is handled by the gateway machine (heterogeneous.ucd.ie). This machine is not considered a compute node and should not be used as one. The only new incoming connections allowed are ssh; other incoming packets, such as http responses to requests originating inside the cluster (established or related), are also allowed. Incoming ssh packets are accepted only if they originate from designated IP addresses, which must be registered UCD IPs. csserver.ucd.ie is allowed, as is hclgate.ucd.ie, on which all users have accounts. Thus, to gain access to the cluster, ssh from csserver, hclgate, or another allowed machine to heterogeneous. From there you can ssh to any of the nodes (hcl01-hcl16).
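The two-hop access path described above (allowed machine, then heterogeneous, then a node) can be encoded once in a client-side ssh configuration so that a single <code>ssh</code> command performs both hops. This is a sketch only; it assumes OpenSSH with netcat available on the intermediate hosts, and the Host aliases and username are hypothetical:

```text
# ~/.ssh/config on your own machine (sketch; "youruser" is a placeholder)
Host csserver
    HostName csserver.ucd.ie
    User youruser

# Reach the gateway by tunnelling through csserver
Host heterogeneous
    HostName heterogeneous.ucd.ie
    User youruser
    ProxyCommand ssh csserver nc %h %p

# Reach any compute node (hcl01-hcl16) by tunnelling through the gateway
Host hcl*
    User youruser
    ProxyCommand ssh heterogeneous nc %h %p
```

After this, <code>ssh hcl05</code> performs both hops transparently.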
Access from outside the UCD network is possible only once you have gained entry to a server that allows outside connections (such as csserver.ucd.ie).

== Some networking issues on HCL cluster (unsolved) ==
"/sbin/route" should give:
 Kernel IP routing table
 Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
 239.2.11.72     *               255.255.255.255 UH    0      0        0 eth0
 heterogeneous.u *               255.255.255.255 UH    0      0        0 eth0
 192.168.21.0    *               255.255.255.0   U     0      0        0 eth1
 192.168.20.0    *               255.255.255.0   U     0      0        0 eth0
 192.168.20.0    *               255.255.254.0   U     0      0        0 eth0
 192.168.20.0    *               255.255.254.0   U     0      0        0 eth1
 default         heterogeneous.u 0.0.0.0         UG    0      0        0 eth0
For reasons unclear, many machines are sometimes missing the entry:
 192.168.21.0    *               255.255.255.0   U     0      0        0 eth1
For Open MPI, this makes it impossible to complete a sockets "connect" call to any 192.*.21.* address (it hangs). In this case, you can either:
* switch off eth1 (see also [http://hcl.ucd.ie/wiki/index.php/OpenMPI]):
 mpirun --mca btl_tcp_if_exclude lo,eth1 ...
or
* restore the table above on all nodes by running "sh /etc/network/if-up.d/00routes" as root.
It is not yet clear why, without this entry, connections to the "21" addresses fail. We would expect the following rule to match in that case (because of the mask):
 192.168.20.0    *               255.255.254.0   U     0      0        0 eth0
The packets would then leave over the eth0 interface and travel via switch1 to switch2 and on to the eth1 interface of the corresponding node.
* If one attempts a ping from node A, via its eth0 interface, to the address of another node B's eth1 interface, the following is observed:
** outgoing ping packets appear only on the eth0 interface of node A;
** incoming ping packets appear only on the eth1 interface of node B;
** outgoing ping response packets appear on the eth0 interface of node B, never on the eth1 interface, despite the eth1 address being pinged specifically.
What explains this?
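The expectation above, that a 192.168.21.x destination should still match the 192.168.20.0/255.255.254.0 rule, can be checked with shell arithmetic, which performs the same masked comparison the kernel does when matching routes. A sketch; <code>to_int</code> and <code>matches</code> are hypothetical helper names:

```shell
#!/bin/sh
# to_int: dotted quad -> 32-bit integer
to_int() {
    IFS=. read -r a b c d <<EOF
$1
EOF
    echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}
# matches <addr> <network> <netmask>: succeeds when addr & netmask
# equals the network address, i.e. the kernel's route-matching test.
matches() {
    addr=$(to_int "$1"); net=$(to_int "$2"); mask=$(to_int "$3")
    [ $(( addr & mask )) -eq "$net" ]
}
# 192.168.21.5 & 255.255.254.0 = 192.168.20.0, so the /23 rule matches.
```

Since 21 & 254 = 20 in the third octet, the /23 entry does cover the "21" subnet, which is exactly why the hang is puzzling when the eth1 entry is missing.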
With the routing tables as shown above, or in the damaged case, the ping may arrive at the correct interface, but the response from B is routed to A-eth0 via B-eth0. Further, after a number of ping packets have been sent in sequence (50 to 100), pings from A begin to appear on both A-eth0 and A-eth1, even though the -I eth0 switch is specified. This behaviour is unexpected, but does not affect the return path of the ping response packet. To get symmetric behaviour, where a packet leaves A-eth0, travels via the switch bridge to B-eth1, and returns from B-eth1 to A-eth0, one must ensure that the routing table of B contains no eth0 entries.

== Paging and the OOM-Killer ==
Due to the nature of the experiments run on the cluster, we often induce heavy paging and complete exhaustion of available memory on certain nodes. Linux has a two-part strategy for dealing with heavy memory use. The first part is overcommitting: a process is allowed to allocate or fork even when no more memory is available. You can see some interesting numbers here: [http://opsmonkey.blogspot.com/2007/01/linux-memory-overcommit.html]. The assumption is that processes may never use all the memory they allocate, and that failing at allocation time is worse than failing later, when the memory is actually needed. More processes can be supported by allowing them to allocate memory, provided they do not use all of it.

The second part of the strategy is the Out-of-Memory killer (OOM killer). When memory has been exhausted and a process tries to use an 'overcommitted' part of memory, the OOM killer is invoked. Its job is to rank all processes by their memory use, priority, privilege, and some other parameters, and then select a process to kill based on those ranks.
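The ranking can be inspected directly: the kernel exports each process's current badness score as /proc/&lt;pid&gt;/oom_score. A sketch of listing the current top candidates (<code>top_oom_scores</code> is a hypothetical name; it assumes a kernel recent enough to expose oom_score):

```shell
#!/bin/sh
# Print the five processes the OOM killer currently considers the
# best candidates to kill, highest badness score first.
top_oom_scores() {
    for p in /proc/[0-9]*; do
        [ -r "$p/oom_score" ] || continue
        printf '%s pid=%s\n' "$(cat "$p/oom_score")" "${p#/proc/}"
    done | sort -rn | head -5
}
```

Running this on a node under memory pressure shows which process would be sacrificed if the OOM killer were invoked at that moment.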
The argument for using overcommit plus the OOM killer is that, rather than failing to allocate memory for some random unlucky process, which would probably terminate as a result, the kernel can allow the unlucky process to continue executing and later make a somewhat informed decision about which process to kill. Unfortunately, the behaviour of the OOM killer sometimes causes problems that grind the machine to a complete halt, particularly when it decides to kill system processes. There is a good discussion of the OOM killer here: [http://lwn.net/Articles/104179/]

For this reason, overcommit has been disabled on the cluster:
 cat /proc/sys/vm/overcommit_memory
 2
 cat /proc/sys/vm/overcommit_ratio
 100
To restore the default overcommit behaviour:
 # echo 0 > /proc/sys/vm/overcommit_memory
 # echo 50 > /proc/sys/vm/overcommit_ratio
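The meaning of the overcommit_memory values can be summarized in a small helper (<code>describe_overcommit</code> is a hypothetical name; the mode meanings come from the kernel's overcommit-accounting documentation):

```shell
#!/bin/sh
# describe_overcommit <mode>: explain a vm.overcommit_memory setting.
describe_overcommit() {
    case "$1" in
        0) echo "heuristic overcommit (kernel default)" ;;
        1) echo "always overcommit, never refuse an allocation" ;;
        2) echo "strict accounting: commit limit = swap + overcommit_ratio% of RAM" ;;
        *) echo "unknown mode: $1" ;;
    esac
}
# On a node:
#   describe_overcommit "$(cat /proc/sys/vm/overcommit_memory)"
```

With the cluster's settings (mode 2, ratio 100), the commit limit is therefore swap plus all of physical RAM, so allocations beyond that fail immediately instead of triggering the OOM killer later.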
* [[media:Cisco3560Specs.pdf|Cisco Catalyst 3560 Specifications]] * [[media:Cisco3560Guide.pdf|Cisco Catalyst 3560 User Guide]] * [[media:PE750.tgz|Dell Poweredge 750 Documentation]] * [[media:SC1425.tgz|Dell Poweredge SC1425 Documentation]] * [[media:X306.pdf|IBM x-Series 306 Documentation]] * [[media:E326.pdf|IBM e-Series 326 Documentation ]] * [[media:Proliant100SeriesGuide.pdf|HP Proliant DL-140 G2 Documentation]] * [[media:ProliantDL320G3Guide.pdf|HP Proliant DL-320 G3 Documentation]] The diagram shows a schematic of the cluster. == Detailed Cluster Specification == A table of hardware configuration is available here: [[Cluster Specification]] == Cluster Administration == ===Useful Tools=== <code>root</code> on <code>heterogeneous.ucd.ie</code> has a number of [http://expect.nist.gov/ Expect] scripts to automate administration on the cluster (in <code>/root/scripts</code>). <code>root_ssh</code> will automatically log into a host, provide the root password and either return a shell to the user or execute a command that is passed as a second argument. Command syntax is as follows: <source lang="text"> # root_ssh usage: root_ssh [user@]<host> [command] </source> Example usage, to login and execute a command on each node in the cluster (note the file <code>/etc/dsh/machines.list</code> contains the hostnames of all compute nodes of the cluster): # for i in `cat /etc/dsh/machines.list`; do root_ssh $i ps ax \| grep pbs; done The above is sequential. To run parallel jobs, for example: <code>apt-get update && apt-get -y upgrade</code>, try the following trick with [http://www.gnu.org/software/screen/ screen]: # for i in `cat /etc/dsh/machines.list`; do screen -L -d -m root_ssh $i apt-get update \&\& apt-get -y upgrade'; done You can check the screenlog.* files for errors and delete them when you are happy. Sometimes all logs are sent to screenlog.0, not sure why. 
== Software packages available on HCL Cluster 2.0 ==
With a fresh installation of the operating system on the HCL Cluster, the following packages are available:
* autoconf
* automake
* fftw2
* git
* gfortran
* gnuplot
* libtool
* netperf
* octave3.2
* qhull
* subversion
* valgrind
* gsl-dev
* vim
* python
* mc
* openmpi-bin
* openmpi-dev
* evince
* libboost-graph-dev
* libboost-serialization-dev
* r-cran-strucchange
* graphviz
* doxygen
* colorgcc
[[new hcl node install & configuration log]]
[[new heterogeneous.ucd.ie install log]]
===APT===
To perform unattended updates on cluster machines, you need to set an environment variable and pass some switches to apt-get:
 export DEBIAN_FRONTEND=noninteractive
 apt-get -q -y upgrade
NOTE: on hcl01 and hcl02 any update to grub will force a prompt, despite the switches above. This happens because these machines have two disks and grub asks which one it should install itself on.
== Access and Security ==
All access and security for the cluster is handled by the gateway machine (heterogeneous.ucd.ie). This machine is not considered a compute node and should not be used as such. The only new incoming connections allowed are ssh; other incoming packets, such as http responses to requests made from inside the cluster (established or related), are also allowed. Incoming ssh packets are only accepted if they originate from designated IP addresses, which must be registered UCD addresses. csserver.ucd.ie is allowed, as is hclgate.ucd.ie, on which all users have accounts. Thus, to gain access to the cluster, you can ssh from csserver, hclgate or another allowed machine to heterogeneous. From there you can ssh to any of the nodes (hcl01-hcl16).
Access from outside the UCD network is only possible once you have gained entry to a server that allows outside connections (such as csserver.ucd.ie).
== Some networking issues on HCL cluster (unsolved) ==
"/sbin/route" should give:
 Kernel IP routing table
 Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
 239.2.11.72     *               255.255.255.255 UH    0      0        0 eth0
 heterogeneous.u *               255.255.255.255 UH    0      0        0 eth0
 192.168.21.0    *               255.255.255.0   U     0      0        0 eth1
 192.168.20.0    *               255.255.255.0   U     0      0        0 eth0
 192.168.20.0    *               255.255.254.0   U     0      0        0 eth0
 192.168.20.0    *               255.255.254.0   U     0      0        0 eth1
 default         heterogeneous.u 0.0.0.0         UG    0      0        0 eth0
For unclear reasons, many machines are sometimes missing the entry:
 192.168.21.0    *               255.255.255.0   U     0      0        0 eth1
For Open MPI, this makes a sockets "connect" call to any 192.*.21.* address hang. In this case, you can either
* switch off eth1 (see also [http://hcl.ucd.ie/wiki/index.php/OpenMPI]):
 mpirun --mca btl_tcp_if_exclude lo,eth1 ...
or
* restore the table above on all nodes by running "sh /etc/network/if-up.d/00routes" as root.
It is not yet clear why, without this entry, connections to the "21" addresses fail. We would expect the following rule to be matched in this case (because of the mask):
 192.168.20.0    *               255.255.254.0   U     0      0        0 eth0
The packets would then leave over the eth0 interface and travel via switch 1 to switch 2 and on to the eth1 interface of the corresponding node.
* If one attempts a ping from one node A, via its eth0 interface, to the address of another node B's eth1 interface, the following is observed:
** outgoing ping packets appear only on the eth0 interface of node A;
** incoming ping packets appear only on the eth1 interface of node B;
** outgoing ping response packets appear on the eth0 interface of node B, never on the eth1 interface, despite the eth1 address being pinged specifically.
What explains this?
With the routing tables as above, or in the damaged case, the ping may arrive at the correct interface, but the response from B is routed to A-eth0 via B-eth0. Further, after a number of ping packets have been sent in sequence (50 to 100), pings from A begin to appear on both A-eth0 and A-eth1, even though the -I eth0 switch is specified. This behaviour is unexpected, but does not affect the return path of the ping response packet. To get symmetric behaviour, where a packet leaves A-eth0, travels via the switch bridge to B-eth1 and returns from B-eth1 to A-eth0, one must ensure that the routing table of B contains no eth0 entries.
== Paging and the OOM-Killer ==
Due to the nature of experiments run on the cluster, we often induce heavy paging and complete exhaustion of available memory on certain nodes. Linux has a pair of strategies to deal with heavy memory use. The first is overcommitting: a process is allowed to allocate memory or fork even when no more memory is available. You can see some interesting numbers here: [http://opsmonkey.blogspot.com/2007/01/linux-memory-overcommit.html]. The assumption is that processes may not use all the memory they allocate, and that failing at allocation time is worse than failing later, when the memory is actually used. More processes can be supported by allowing them to allocate memory (provided they do not use it all). The second part of the strategy is the Out-of-Memory killer (OOM Killer). When memory has been exhausted and a process tries to use some 'overcommitted' part of memory, the OOM Killer is invoked. Its job is to rank all processes in terms of their memory use, priority, privilege and some other parameters, and then select a process to kill based on the ranks.
The argument for using overcommit plus the OOM Killer is that, rather than failing to allocate memory for some random unlucky process (which would probably terminate as a result), the kernel can allow that process to continue executing and later make a somewhat informed decision about which process to kill. Unfortunately, the behaviour of the OOM Killer sometimes causes problems that grind the machine to a complete halt, particularly when it decides to kill system processes. There is a good discussion of the OOM Killer here: [http://lwn.net/Articles/104179/] For this reason overcommit has been disabled on the cluster:
 cat /proc/sys/vm/overcommit_memory
 2
 cat /proc/sys/vm/overcommit_ratio
 100
To restore the default overcommit behaviour:
 # echo 0 > /proc/sys/vm/overcommit_memory
 # echo 50 > /proc/sys/vm/overcommit_ratio
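The echo commands above can equally be expressed with sysctl. This is a sketch of an alternative, not necessarily how the cluster is actually configured; persisting the values in /etc/sysctl.conf is the standard Debian mechanism:

```shell
# Same overcommit settings as the echo commands above, via sysctl (run as root).
sysctl -w vm.overcommit_memory=2    # 2 = strict accounting: no overcommit
sysctl -w vm.overcommit_ratio=100   # commit limit = swap + 100% of RAM

# To persist across reboots, the usual Debian approach is /etc/sysctl.conf:
#   vm.overcommit_memory = 2
#   vm.overcommit_ratio = 100
```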
== Software packages available on HCL Cluster 2.0 == Wit a fresh installation of operating systems on HCL Cluster the follow list of packages are avalible: * autoconf * automake * fftw2 * git * gfortran * gnuplot * libtool * netperf * octave3.2 * qhull * subversion * valgrind * gsl-dev * vim * python * mc * openmpi-bin * openmpi-dev * evince * libboost-graph-dev * libboost-serialization-dev * r-cran-strucchange * graphviz * doxygen * colorgcc [[new hcl node install & configuration log]] [[new heterogeneous.ucd.ie install log]] ===APT=== To do unattended updates on cluster machines you need to specify some environment variables and switches to apt-get: export DEBIAN_FRONTEND=noninteractive apt-get -q -y upgrade NOTE: on hcl01 and hcl02 any updates to grub will force a prompt, despite the switches above. This happens because there are two disks on these machines and grub asks which it should install itself on. == Access and Security == All access and security for the cluster is handled by the gateway machine (heterogeneous.ucd.ie). This machine is not considered a compute node and should not be used as such. The only new incoming connections allowed are ssh, other incoming packets such as http that are responding to requests from inside the cluster (established or related) are also allowed. Incoming ssh packets are only accepted if they are originating from designated IP addresses. These IP's must be registered ucd IP's. csserver.ucd.ie is allowed, as is hclgate.ucd.ie, on which all users have accounts. Thus to gain access to the cluster you can ssh from csserver, hclgate or other allowed machines to heterogeneous. From there you can ssh to any of the nodes (hcl01-hcl16). 
Access from outside the UCD network is only allowed once you have gained entry to a server that allows outside connections (such as csserver.ucd.ie) == Some networking issues on HCL cluster (unsolved) == "/sbin/route" should give: Kernel IP routing table Destination Gateway Genmask Flags Metric Ref Use Iface 239.2.11.72 * 255.255.255.255 UH 0 0 0 eth0 heterogeneous.u * 255.255.255.255 UH 0 0 0 eth0 192.168.21.0 * 255.255.255.0 U 0 0 0 eth1 192.168.20.0 * 255.255.255.0 U 0 0 0 eth0 192.168.20.0 * 255.255.254.0 U 0 0 0 eth0 192.168.20.0 * 255.255.254.0 U 0 0 0 eth1 default heterogeneous.u 0.0.0.0 UG 0 0 0 eth0 For reasons unclear, sometimes many machines miss the entry: 192.168.21.0 * 255.255.255.0 U 0 0 0 eth1 For Open MPI, this leads to inability to do a system sockets "connect" call to any 192.*.21.* address (hangup). For this case, you can * switch off eth1 (see also [http://hcl.ucd.ie/wiki/index.php/OpenMPI] ): mpirun --mca btl_tcp_if_exclude lo,eth1 ... or * you can restore the above table on all nodes by running "sh /etc/network/if-up.d/00routes" as root It is not yet clear why without this entry the connection to the "21" addresses can't be connected. We expect that in this case following rule should be matched (because of the mask): 192.168.20.0 * 255.255.254.0 U 0 0 0 eth0 The packets leave over the eth0 network interface then and should go over switch1 to switch2 and eth1 interface of the corresponding node * If one attempts a ping from one node A, via its eth0 interface, to the address of another node's (B) eth1 interface, the following is observed: ** outgoing ping packets appear only on the eth0 interface of the first node A. ** incoming ping packets appear only on eth1 interface of the second node B. ** outgoing ping response packets appear on the eth0 interface of the second node B, never on the eth1 interface despite pinging the eth1 address specifically. What explains this? 
With the routing tables as they are above, or in the damaged case, the ping may arrive to the correct interface, but the response from B is routed to A-eth0 via B-eth0. Further, after a number of ping packets have been sent in sequence (50 to 100), pings from A, though the -i eth0 switch is specified, begin to appear on both A-eth0 and A-eth1. This behaviour is unexpected, but does not effect the return path of the ping response packet. In order to get a symmetric behaviour, where a packet leaves A-eth0, travels via the switch bridge to B-eth1 and returns back from B-eth1 to A-eth0, one must ensure the routing table of B contains no eth0 entries. == Paging and the OOM-Killer == Due to the nature of experiments run on the cluster, we often induce heavy paging and complete exhaustion of available memory on certain nodes. Linux has a pair of strategies to deal with heavy memory use. First, is overcommitting. This is where a process is allowed allocate or fork even when there is no more memory available. You can seem some interesting numbers here:[http://opsmonkey.blogspot.com/2007/01/linux-memory-overcommit.html]. The assumption is that processes may not use all memory that they allocate and failing on allocation is worse than failing at a later date when the memory use is actually required. More processes may be supported by allowing them to allocate memory (provided they do not use it all). The second part of the strategy is the Out-of-Memory killer (OOM Killer). When memory has been exhausted and a process tries to use some 'overcommitted' part of memory, the OOM killer is invoked. It's job is to rank all processes in terms of their memory use, priority, privilege and some other parameters, and then select a process to kill based on the ranks. 
The argument for using overcommital+OOM Killer is that rather than failing to allocate memory for some random unlucky process, which as a result would probably terminate, the kernel can instead allow the unlucky process to continue executing and then make a some-what-informed decision on which process to kill. Unfortunately, the behaviour of the OOM-killer sometimes causes problems which grind the machine to a complete halt, particularly when it decides to kill system processes. There is a good discussion on the OOM-killer here: [http://lwn.net/Articles/104179/] For this reason overcommit has been disabled on the cluster. cat /proc/sys/vm/overcommit_memory 2 cat /proc/sys/vm/overcommit_ratio 100 To restore to default overcommit # echo 0 > /proc/sys/vm/overcommit_memory # echo 50 > /proc/sys/vm/overcommit_ratio 4ef58503b7655ebb038b4943570302d85f60d7a0 467 466 2010-08-18T10:42:32Z Root 1 wikitext text/x-wiki == General Information == [[Image:Cluster.jpg|right|thumbnail||HCL Cluster]] [[Image:network.jpg|right|thumbnail||Layout of the Cluster]] The hcl cluster is heterogeneous in computing hardware & network ability. Nodes are from Dell, IBM, and HP, with Celeron, Pentium 4, Xeon, and AMD processors ranging in speeds from 1.8 to 3.6Ghz. Accordingly architectures and parameters such as Front Side Bus, Cache, and Main Memory all vary. Operating System used is Debian “squeeze” with Linux kernel 2.6.32. The network hardware consists of two Cisco 24+4 port Gigabit switches. Each node has two Gigabit ethernet ports - each eth0 is connected to the first switch, and each eth1 is connected to the second switch. The switches are also connected to each other. The bandwidth of each port can be configured to meet any value between 8Kb/s and 1Gb/s. This allows testing on a very large number of network topologies, As the bandwidth on the link connecting the two switches can also be configured, the cluster can actually act as two separate clusters connected via one link. 
The diagram shows a schematic of the cluster. === Detailed Cluster Specification === * [[HCL Cluster Specifications]] * [[Old HCL Cluster Specifications]] === Documentation === * [[media:PE750.tgz|Dell Poweredge 750 Documentation]] * [[media:SC1425.tgz|Dell Poweredge SC1425 Documentation]] * [[media:X306.pdf|IBM x-Series 306 Documentation]] * [[media:E326.pdf|IBM e-Series 326 Documentation ]] * [[media:Proliant100SeriesGuide.pdf|HP Proliant DL-140 G2 Documentation]] * [[media:ProliantDL320G3Guide.pdf|HP Proliant DL-320 G3 Documentation]] * [[media:Cisco3560Specs.pdf|Cisco Catalyst 3560 Specifications]] * [[media:Cisco3560Guide.pdf|Cisco Catalyst 3560 User Guide]] * [[HCL Cluster Network]] == Cluster Administration == ===Useful Tools=== <code>root</code> on <code>heterogeneous.ucd.ie</code> has a number of [http://expect.nist.gov/ Expect] scripts to automate administration on the cluster (in <code>/root/scripts</code>). <code>root_ssh</code> will automatically log into a host, provide the root password and either return a shell to the user or execute a command that is passed as a second argument. Command syntax is as follows: <source lang="text"> # root_ssh usage: root_ssh [user@]<host> [command] </source> Example usage, to login and execute a command on each node in the cluster (note the file <code>/etc/dsh/machines.list</code> contains the hostnames of all compute nodes of the cluster): # for i in `cat /etc/dsh/machines.list`; do root_ssh $i ps ax \| grep pbs; done The above is sequential. To run parallel jobs, for example: <code>apt-get update && apt-get -y upgrade</code>, try the following trick with [http://www.gnu.org/software/screen/ screen]: # for i in `cat /etc/dsh/machines.list`; do screen -L -d -m root_ssh $i apt-get update \&\& apt-get -y upgrade'; done You can check the screenlog.* files for errors and delete them when you are happy. Sometimes all logs are sent to screenlog.0, not sure why. 
== Software packages available on HCL Cluster 2.0 == Wit a fresh installation of operating systems on HCL Cluster the follow list of packages are avalible: * autoconf * automake * fftw2 * git * gfortran * gnuplot * libtool * netperf * octave3.2 * qhull * subversion * valgrind * gsl-dev * vim * python * mc * openmpi-bin * openmpi-dev * evince * libboost-graph-dev * libboost-serialization-dev * r-cran-strucchange * graphviz * doxygen * colorgcc [[new hcl node install & configuration log]] [[new heterogeneous.ucd.ie install log]] ===APT=== To do unattended updates on cluster machines you need to specify some environment variables and switches to apt-get: export DEBIAN_FRONTEND=noninteractive apt-get -q -y upgrade NOTE: on hcl01 and hcl02 any updates to grub will force a prompt, despite the switches above. This happens because there are two disks on these machines and grub asks which it should install itself on. == Access and Security == All access and security for the cluster is handled by the gateway machine (heterogeneous.ucd.ie). This machine is not considered a compute node and should not be used as such. The only new incoming connections allowed are ssh, other incoming packets such as http that are responding to requests from inside the cluster (established or related) are also allowed. Incoming ssh packets are only accepted if they are originating from designated IP addresses. These IP's must be registered ucd IP's. csserver.ucd.ie is allowed, as is hclgate.ucd.ie, on which all users have accounts. Thus to gain access to the cluster you can ssh from csserver, hclgate or other allowed machines to heterogeneous. From there you can ssh to any of the nodes (hcl01-hcl16). 
Access from outside the UCD network is only allowed once you have gained entry to a server that allows outside connections (such as csserver.ucd.ie) == Some networking issues on HCL cluster (unsolved) == "/sbin/route" should give: Kernel IP routing table Destination Gateway Genmask Flags Metric Ref Use Iface 239.2.11.72 * 255.255.255.255 UH 0 0 0 eth0 heterogeneous.u * 255.255.255.255 UH 0 0 0 eth0 192.168.21.0 * 255.255.255.0 U 0 0 0 eth1 192.168.20.0 * 255.255.255.0 U 0 0 0 eth0 192.168.20.0 * 255.255.254.0 U 0 0 0 eth0 192.168.20.0 * 255.255.254.0 U 0 0 0 eth1 default heterogeneous.u 0.0.0.0 UG 0 0 0 eth0 For reasons unclear, sometimes many machines miss the entry: 192.168.21.0 * 255.255.255.0 U 0 0 0 eth1 For Open MPI, this leads to inability to do a system sockets "connect" call to any 192.*.21.* address (hangup). For this case, you can * switch off eth1 (see also [http://hcl.ucd.ie/wiki/index.php/OpenMPI] ): mpirun --mca btl_tcp_if_exclude lo,eth1 ... or * you can restore the above table on all nodes by running "sh /etc/network/if-up.d/00routes" as root It is not yet clear why without this entry the connection to the "21" addresses can't be connected. We expect that in this case following rule should be matched (because of the mask): 192.168.20.0 * 255.255.254.0 U 0 0 0 eth0 The packets leave over the eth0 network interface then and should go over switch1 to switch2 and eth1 interface of the corresponding node * If one attempts a ping from one node A, via its eth0 interface, to the address of another node's (B) eth1 interface, the following is observed: ** outgoing ping packets appear only on the eth0 interface of the first node A. ** incoming ping packets appear only on eth1 interface of the second node B. ** outgoing ping response packets appear on the eth0 interface of the second node B, never on the eth1 interface despite pinging the eth1 address specifically. What explains this? 
With the routing tables as they are above, or in the damaged case, the ping may arrive to the correct interface, but the response from B is routed to A-eth0 via B-eth0. Further, after a number of ping packets have been sent in sequence (50 to 100), pings from A, though the -i eth0 switch is specified, begin to appear on both A-eth0 and A-eth1. This behaviour is unexpected, but does not effect the return path of the ping response packet. In order to get a symmetric behaviour, where a packet leaves A-eth0, travels via the switch bridge to B-eth1 and returns back from B-eth1 to A-eth0, one must ensure the routing table of B contains no eth0 entries. == Paging and the OOM-Killer == Due to the nature of experiments run on the cluster, we often induce heavy paging and complete exhaustion of available memory on certain nodes. Linux has a pair of strategies to deal with heavy memory use. First, is overcommitting. This is where a process is allowed allocate or fork even when there is no more memory available. You can seem some interesting numbers here:[http://opsmonkey.blogspot.com/2007/01/linux-memory-overcommit.html]. The assumption is that processes may not use all memory that they allocate and failing on allocation is worse than failing at a later date when the memory use is actually required. More processes may be supported by allowing them to allocate memory (provided they do not use it all). The second part of the strategy is the Out-of-Memory killer (OOM Killer). When memory has been exhausted and a process tries to use some 'overcommitted' part of memory, the OOM killer is invoked. It's job is to rank all processes in terms of their memory use, priority, privilege and some other parameters, and then select a process to kill based on the ranks. 
The argument for using overcommital+OOM Killer is that rather than failing to allocate memory for some random unlucky process, which as a result would probably terminate, the kernel can instead allow the unlucky process to continue executing and then make a some-what-informed decision on which process to kill. Unfortunately, the behaviour of the OOM-killer sometimes causes problems which grind the machine to a complete halt, particularly when it decides to kill system processes. There is a good discussion on the OOM-killer here: [http://lwn.net/Articles/104179/] For this reason overcommit has been disabled on the cluster. cat /proc/sys/vm/overcommit_memory 2 cat /proc/sys/vm/overcommit_ratio 100 To restore to default overcommit # echo 0 > /proc/sys/vm/overcommit_memory # echo 50 > /proc/sys/vm/overcommit_ratio c76a6993997796a768ff6d4cb49e9e2d8d4d4c32 479 467 2010-09-16T18:04:53Z Davepc 2 /* Access and Security */ wikitext text/x-wiki == General Information == [[Image:Cluster.jpg|right|thumbnail||HCL Cluster]] [[Image:network.jpg|right|thumbnail||Layout of the Cluster]] The hcl cluster is heterogeneous in computing hardware & network ability. Nodes are from Dell, IBM, and HP, with Celeron, Pentium 4, Xeon, and AMD processors ranging in speeds from 1.8 to 3.6Ghz. Accordingly architectures and parameters such as Front Side Bus, Cache, and Main Memory all vary. Operating System used is Debian “squeeze” with Linux kernel 2.6.32. The network hardware consists of two Cisco 24+4 port Gigabit switches. Each node has two Gigabit ethernet ports - each eth0 is connected to the first switch, and each eth1 is connected to the second switch. The switches are also connected to each other. The bandwidth of each port can be configured to meet any value between 8Kb/s and 1Gb/s. 
This allows testing on a very large number of network topologies, As the bandwidth on the link connecting the two switches can also be configured, the cluster can actually act as two separate clusters connected via one link. The diagram shows a schematic of the cluster. === Detailed Cluster Specification === * [[HCL Cluster Specifications]] * [[Old HCL Cluster Specifications]] === Documentation === * [[media:PE750.tgz|Dell Poweredge 750 Documentation]] * [[media:SC1425.tgz|Dell Poweredge SC1425 Documentation]] * [[media:X306.pdf|IBM x-Series 306 Documentation]] * [[media:E326.pdf|IBM e-Series 326 Documentation ]] * [[media:Proliant100SeriesGuide.pdf|HP Proliant DL-140 G2 Documentation]] * [[media:ProliantDL320G3Guide.pdf|HP Proliant DL-320 G3 Documentation]] * [[media:Cisco3560Specs.pdf|Cisco Catalyst 3560 Specifications]] * [[media:Cisco3560Guide.pdf|Cisco Catalyst 3560 User Guide]] * [[HCL Cluster Network]] == Cluster Administration == ===Useful Tools=== <code>root</code> on <code>heterogeneous.ucd.ie</code> has a number of [http://expect.nist.gov/ Expect] scripts to automate administration on the cluster (in <code>/root/scripts</code>). <code>root_ssh</code> will automatically log into a host, provide the root password and either return a shell to the user or execute a command that is passed as a second argument. Command syntax is as follows: <source lang="text"> # root_ssh usage: root_ssh [user@]<host> [command] </source> Example usage, to login and execute a command on each node in the cluster (note the file <code>/etc/dsh/machines.list</code> contains the hostnames of all compute nodes of the cluster): # for i in `cat /etc/dsh/machines.list`; do root_ssh $i ps ax \| grep pbs; done The above is sequential. 
To run parallel jobs, for example: <code>apt-get update && apt-get -y upgrade</code>, try the following trick with [http://www.gnu.org/software/screen/ screen]: # for i in `cat /etc/dsh/machines.list`; do screen -L -d -m root_ssh $i apt-get update \&\& apt-get -y upgrade'; done You can check the screenlog.* files for errors and delete them when you are happy. Sometimes all logs are sent to screenlog.0, not sure why. == Software packages available on HCL Cluster 2.0 == Wit a fresh installation of operating systems on HCL Cluster the follow list of packages are avalible: * autoconf * automake * fftw2 * git * gfortran * gnuplot * libtool * netperf * octave3.2 * qhull * subversion * valgrind * gsl-dev * vim * python * mc * openmpi-bin * openmpi-dev * evince * libboost-graph-dev * libboost-serialization-dev * r-cran-strucchange * graphviz * doxygen * colorgcc [[new hcl node install & configuration log]] [[new heterogeneous.ucd.ie install log]] ===APT=== To do unattended updates on cluster machines you need to specify some environment variables and switches to apt-get: export DEBIAN_FRONTEND=noninteractive apt-get -q -y upgrade NOTE: on hcl01 and hcl02 any updates to grub will force a prompt, despite the switches above. This happens because there are two disks on these machines and grub asks which it should install itself on. == Access and Security == All access and security for the cluster is handled by the gateway machine (heterogeneous.ucd.ie). This machine is not considered a compute node and should not be used as such. The only new incoming connections allowed are ssh, other incoming packets such as http that are responding to requests from inside the cluster (established or related) are also allowed. Incoming ssh packets are only accepted if they are originating from designated IP addresses. These IP's must be registered ucd IP's. csserver.ucd.ie is allowed, as is hclgate.ucd.ie, on which all users have accounts. 
Thus to gain access to the cluster you can ssh from csserver, hclgate or other allowed machines to heterogeneous. From there you can ssh to any of the nodes (hcl01-hcl16). Access to the nodes is controlled by torque pbs. qsub -I -l walltime=1:00 qsub -l nodes=3 myscript.sh Access from outside the UCD network is only allowed once you have gained entry to a server that allows outside connections (such as csserver.ucd.ie) == Some networking issues on HCL cluster (unsolved) == "/sbin/route" should give: Kernel IP routing table Destination Gateway Genmask Flags Metric Ref Use Iface 239.2.11.72 * 255.255.255.255 UH 0 0 0 eth0 heterogeneous.u * 255.255.255.255 UH 0 0 0 eth0 192.168.21.0 * 255.255.255.0 U 0 0 0 eth1 192.168.20.0 * 255.255.255.0 U 0 0 0 eth0 192.168.20.0 * 255.255.254.0 U 0 0 0 eth0 192.168.20.0 * 255.255.254.0 U 0 0 0 eth1 default heterogeneous.u 0.0.0.0 UG 0 0 0 eth0 For reasons unclear, sometimes many machines miss the entry: 192.168.21.0 * 255.255.255.0 U 0 0 0 eth1 For Open MPI, this leads to inability to do a system sockets "connect" call to any 192.*.21.* address (hangup). For this case, you can * switch off eth1 (see also [http://hcl.ucd.ie/wiki/index.php/OpenMPI] ): mpirun --mca btl_tcp_if_exclude lo,eth1 ... or * you can restore the above table on all nodes by running "sh /etc/network/if-up.d/00routes" as root It is not yet clear why without this entry the connection to the "21" addresses can't be connected. We expect that in this case following rule should be matched (because of the mask): 192.168.20.0 * 255.255.254.0 U 0 0 0 eth0 The packets leave over the eth0 network interface then and should go over switch1 to switch2 and eth1 interface of the corresponding node * If one attempts a ping from one node A, via its eth0 interface, to the address of another node's (B) eth1 interface, the following is observed: ** outgoing ping packets appear only on the eth0 interface of the first node A. 
** incoming ping packets appear only on eth1 interface of the second node B. ** outgoing ping response packets appear on the eth0 interface of the second node B, never on the eth1 interface despite pinging the eth1 address specifically. What explains this? With the routing tables as they are above, or in the damaged case, the ping may arrive to the correct interface, but the response from B is routed to A-eth0 via B-eth0. Further, after a number of ping packets have been sent in sequence (50 to 100), pings from A, though the -i eth0 switch is specified, begin to appear on both A-eth0 and A-eth1. This behaviour is unexpected, but does not effect the return path of the ping response packet. In order to get a symmetric behaviour, where a packet leaves A-eth0, travels via the switch bridge to B-eth1 and returns back from B-eth1 to A-eth0, one must ensure the routing table of B contains no eth0 entries. == Paging and the OOM-Killer == Due to the nature of experiments run on the cluster, we often induce heavy paging and complete exhaustion of available memory on certain nodes. Linux has a pair of strategies to deal with heavy memory use. First, is overcommitting. This is where a process is allowed allocate or fork even when there is no more memory available. You can seem some interesting numbers here:[http://opsmonkey.blogspot.com/2007/01/linux-memory-overcommit.html]. The assumption is that processes may not use all memory that they allocate and failing on allocation is worse than failing at a later date when the memory use is actually required. More processes may be supported by allowing them to allocate memory (provided they do not use it all). The second part of the strategy is the Out-of-Memory killer (OOM Killer). When memory has been exhausted and a process tries to use some 'overcommitted' part of memory, the OOM killer is invoked. 
== General Information ==
[[Image:Cluster.jpg|right|thumbnail||HCL Cluster]]
[[Image:network.jpg|right|thumbnail||Layout of the Cluster]]

The HCL cluster is heterogeneous in both computing hardware and network capability. Nodes are from Dell, IBM, and HP, with Celeron, Pentium 4, Xeon, and AMD processors ranging in speed from 1.8 to 3.6 GHz. Architectural parameters such as front-side bus, cache, and main memory vary accordingly. The operating system is Debian "squeeze" with Linux kernel 2.6.32.

The network hardware consists of two Cisco 24+4-port Gigabit switches. Each node has two Gigabit Ethernet ports: eth0 is connected to the first switch and eth1 to the second. The switches are also connected to each other. The bandwidth of each port can be configured to any value between 8 Kb/s and 1 Gb/s.
This allows testing on a very large number of network topologies. As the bandwidth of the link connecting the two switches can also be configured, the cluster can act as two separate clusters connected by a single link. The diagram shows a schematic of the cluster.

=== Detailed Cluster Specification ===
* [[HCL Cluster Specifications]]
* [[Old HCL Cluster Specifications]]

=== Documentation ===
* [[media:PE750.tgz|Dell Poweredge 750 Documentation]]
* [[media:SC1425.tgz|Dell Poweredge SC1425 Documentation]]
* [[media:X306.pdf|IBM x-Series 306 Documentation]]
* [[media:E326.pdf|IBM e-Series 326 Documentation]]
* [[media:Proliant100SeriesGuide.pdf|HP Proliant DL-140 G2 Documentation]]
* [[media:ProliantDL320G3Guide.pdf|HP Proliant DL-320 G3 Documentation]]
* [[media:Cisco3560Specs.pdf|Cisco Catalyst 3560 Specifications]]
* [[media:Cisco3560Guide.pdf|Cisco Catalyst 3560 User Guide]]
* [[HCL Cluster Network]]

== Cluster Administration ==

=== Useful Tools ===
<code>root</code> on <code>heterogeneous.ucd.ie</code> has a number of [http://expect.nist.gov/ Expect] scripts (in <code>/root/scripts</code>) to automate administration of the cluster. <code>root_ssh</code> automatically logs into a host, supplies the root password, and either returns a shell or executes a command passed as the second argument. The syntax is as follows:

<source lang="text">
# root_ssh
usage: root_ssh [user@]<host> [command]
</source>

Example usage, to log in and execute a command on each node in the cluster (the file <code>/etc/dsh/machines.list</code> contains the hostnames of all compute nodes):

 # for i in `cat /etc/dsh/machines.list`; do root_ssh $i ps ax \| grep pbs; done

The above is sequential.
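The sequential loop can be wrapped in a small helper so that each node's output lands in its own log file. This is only a sketch under our own conventions — the function name, the <code>log.&lt;host&gt;</code> layout, and the <code>SSH_CMD</code> override are not part of the cluster's scripts:

```shell
#!/bin/sh
# Run a command on every host listed in a machines file, one host at a
# time, saving each host's combined output to log.<host>.
# SSH_CMD defaults to ssh; set it to root_ssh on the gateway, or to a
# stub when testing locally.
run_on_all() {
    machines=$1; shift
    : "${SSH_CMD:=ssh}"
    while read -r host; do
        [ -n "$host" ] || continue          # skip blank lines
        "$SSH_CMD" "$host" "$@" > "log.$host" 2>&1
    done < "$machines"
}
```

For example, <code>SSH_CMD=root_ssh run_on_all /etc/dsh/machines.list ps ax</code> would reproduce the loop above with per-node logs.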
To run jobs in parallel, for example <code>apt-get update && apt-get -y upgrade</code>, try the following trick with [http://www.gnu.org/software/screen/ screen]:

 # for i in `cat /etc/dsh/machines.list`; do screen -L -d -m root_ssh $i 'apt-get update && apt-get -y upgrade'; done

You can check the screenlog.* files for errors and delete them when you are happy. Sometimes all logs are sent to screenlog.0; it is not clear why.

== Software packages available on HCL Cluster 2.0 ==
With a fresh installation of the operating system on the HCL cluster, the following packages are available:
* autoconf
* automake
* fftw2
* git
* gfortran
* gnuplot
* libtool
* netperf
* octave3.2
* qhull
* subversion
* valgrind
* gsl-dev
* vim
* python
* mc
* openmpi-bin
* openmpi-dev
* evince
* libboost-graph-dev
* libboost-serialization-dev
* r-cran-strucchange
* graphviz
* doxygen
* colorgcc

[[new hcl node install & configuration log]]

[[new heterogeneous.ucd.ie install log]]

=== APT ===
To perform unattended updates on cluster machines you need to set an environment variable and pass some switches to apt-get:

 export DEBIAN_FRONTEND=noninteractive
 apt-get -q -y upgrade

NOTE: on hcl01 and hcl02 any update to grub will force a prompt, despite the switches above. This happens because these machines have two disks and grub asks which one it should install itself on.

== Access and Security ==
All access and security for the cluster is handled by the gateway machine (heterogeneous.ucd.ie). This machine is not considered a compute node and should not be used as one. The only new incoming connections allowed are ssh; other incoming packets, such as http responses to requests originating inside the cluster (established or related), are also allowed. Incoming ssh packets are accepted only if they originate from designated IP addresses. These must be registered UCD IPs. csserver.ucd.ie is allowed, as is hclgate.ucd.ie, on which all users have accounts.
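Logging in therefore involves two hops. The hop through the gateway can be automated client-side; the following <code>~/.ssh/config</code> fragment is only an illustration (the <code>Host</code> aliases and username are hypothetical, and <code>ssh -W</code> requires a reasonably recent OpenSSH — older versions can substitute <code>nc %h %p</code> in the ProxyCommand):

```text
# ~/.ssh/config on your own machine; 'youruser' is a placeholder.
Host hclgw
    HostName heterogeneous.ucd.ie
    User youruser
    ProxyCommand ssh youruser@csserver.ucd.ie -W %h:%p

# Reach a node in one command: ssh hcl01
Host hcl01 hcl02 hcl03
    User youruser
    ProxyCommand ssh hclgw -W %h:%p
```

With something like this in place, <code>ssh hcl01</code> tunnels through csserver and the gateway in a single step.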
Thus, to gain access to the cluster you can ssh from csserver, hclgate or another allowed machine to heterogeneous. From there you can ssh to any of the nodes (hcl01-hcl16) on which you are running a PBS job. Access from outside the UCD network is only possible once you have gained entry to a server that allows outside connections (such as csserver.ucd.ie).

=== Access to the nodes is controlled by Torque PBS ===
Use qsub to submit a job; -I requests an interactive session, and walltime is the time required.

To reserve 1 node for 1 hour (note walltime is given as HH:MM:SS):
 qsub -I -l walltime=1:00:00

To run a script on specific nodes:
 qsub -l nodes=hcl01+hcl07,walltime=1:00:00 myscript.sh

Example script:

 #!/bin/sh
 # General Script
 #
 # These commands set up the Grid Environment for your job:
 #PBS -N JOBNAME
 #PBS -l walltime=48:00:00
 #PBS -l nodes=16
 #PBS -m abe
 #PBS -k eo
 #PBS -V
 echo foo

To see the queue:
 qstat -n
 showq

To remove your job:
 qdel JOBNUM

More info: [http://www.clusterresources.com/products/torque/docs/]

== Some networking issues on HCL cluster (unsolved) ==
<code>/sbin/route</code> should give:

 Kernel IP routing table
 Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
 239.2.11.72     *               255.255.255.255 UH    0      0        0 eth0
 heterogeneous.u *               255.255.255.255 UH    0      0        0 eth0
 192.168.21.0    *               255.255.255.0   U     0      0        0 eth1
 192.168.20.0    *               255.255.255.0   U     0      0        0 eth0
 192.168.20.0    *               255.255.254.0   U     0      0        0 eth0
 192.168.20.0    *               255.255.254.0   U     0      0        0 eth1
 default         heterogeneous.u 0.0.0.0         UG    0      0        0 eth0

For reasons unclear, many machines sometimes lose the entry:

 192.168.21.0    *               255.255.255.0   U     0      0        0 eth1

For Open MPI, this leads to an inability to make a sockets "connect" call to any 192.*.21.* address (it hangs). In this case, you can:
* switch off eth1 (see also [http://hcl.ucd.ie/wiki/index.php/OpenMPI]):
 mpirun --mca btl_tcp_if_exclude lo,eth1 ...
or
* restore the above table on all nodes by running <code>sh /etc/network/if-up.d/00routes</code> as root.
It is not yet clear why, without this entry, connections to the "21" addresses fail.
We expect that in this case the following rule should be matched (because of the mask):

 192.168.20.0    *               255.255.254.0   U     0      0        0 eth0

The packets then leave over the eth0 network interface and should travel via switch1 to switch2 and on to the eth1 interface of the corresponding node.
* If one attempts a ping from one node A, via its eth0 interface, to the address of another node's (B) eth1 interface, the following is observed:
** outgoing ping packets appear only on the eth0 interface of the first node A.
** incoming ping packets appear only on the eth1 interface of the second node B.
** outgoing ping response packets appear on the eth0 interface of the second node B, never on the eth1 interface, despite the eth1 address being pinged specifically.

What explains this? With the routing tables as above, or in the damaged case, the ping may arrive at the correct interface, but the response from B is routed to A-eth0 via B-eth0. Further, after a number of ping packets have been sent in sequence (50 to 100), pings from A, even though the -i eth0 switch is specified, begin to appear on both A-eth0 and A-eth1. This behaviour is unexpected, but does not affect the return path of the ping response packet. To get symmetric behaviour, where a packet leaves A-eth0, travels via the switch bridge to B-eth1 and returns from B-eth1 to A-eth0, one must ensure the routing table of B contains no eth0 entries.

== Paging and the OOM-Killer ==
Due to the nature of experiments run on the cluster, we often induce heavy paging and complete exhaustion of available memory on certain nodes. Linux has a pair of strategies to deal with heavy memory use. The first is overcommitting: a process is allowed to allocate memory or fork even when no more memory is available. You can see some interesting numbers here: [http://opsmonkey.blogspot.com/2007/01/linux-memory-overcommit.html].
The assumption is that processes may not use all memory that they allocate and failing on allocation is worse than failing at a later date when the memory use is actually required. More processes may be supported by allowing them to allocate memory (provided they do not use it all). The second part of the strategy is the Out-of-Memory killer (OOM Killer). When memory has been exhausted and a process tries to use some 'overcommitted' part of memory, the OOM killer is invoked. It's job is to rank all processes in terms of their memory use, priority, privilege and some other parameters, and then select a process to kill based on the ranks. The argument for using overcommital+OOM Killer is that rather than failing to allocate memory for some random unlucky process, which as a result would probably terminate, the kernel can instead allow the unlucky process to continue executing and then make a some-what-informed decision on which process to kill. Unfortunately, the behaviour of the OOM-killer sometimes causes problems which grind the machine to a complete halt, particularly when it decides to kill system processes. There is a good discussion on the OOM-killer here: [http://lwn.net/Articles/104179/] For this reason overcommit has been disabled on the cluster. cat /proc/sys/vm/overcommit_memory 2 cat /proc/sys/vm/overcommit_ratio 100 To restore to default overcommit # echo 0 > /proc/sys/vm/overcommit_memory # echo 50 > /proc/sys/vm/overcommit_ratio 409af9c185b612baa52ce73b2fee54ef1d986304 481 480 2010-09-16T18:20:18Z Davepc 2 /* Access to the nodes is controlled by torque pbs. */ wikitext text/x-wiki == General Information == [[Image:Cluster.jpg|right|thumbnail||HCL Cluster]] [[Image:network.jpg|right|thumbnail||Layout of the Cluster]] The hcl cluster is heterogeneous in computing hardware & network ability. Nodes are from Dell, IBM, and HP, with Celeron, Pentium 4, Xeon, and AMD processors ranging in speeds from 1.8 to 3.6Ghz. 
Accordingly architectures and parameters such as Front Side Bus, Cache, and Main Memory all vary. Operating System used is Debian “squeeze” with Linux kernel 2.6.32. The network hardware consists of two Cisco 24+4 port Gigabit switches. Each node has two Gigabit ethernet ports - each eth0 is connected to the first switch, and each eth1 is connected to the second switch. The switches are also connected to each other. The bandwidth of each port can be configured to meet any value between 8Kb/s and 1Gb/s. This allows testing on a very large number of network topologies, As the bandwidth on the link connecting the two switches can also be configured, the cluster can actually act as two separate clusters connected via one link. The diagram shows a schematic of the cluster. === Detailed Cluster Specification === * [[HCL Cluster Specifications]] * [[Old HCL Cluster Specifications]] === Documentation === * [[media:PE750.tgz|Dell Poweredge 750 Documentation]] * [[media:SC1425.tgz|Dell Poweredge SC1425 Documentation]] * [[media:X306.pdf|IBM x-Series 306 Documentation]] * [[media:E326.pdf|IBM e-Series 326 Documentation ]] * [[media:Proliant100SeriesGuide.pdf|HP Proliant DL-140 G2 Documentation]] * [[media:ProliantDL320G3Guide.pdf|HP Proliant DL-320 G3 Documentation]] * [[media:Cisco3560Specs.pdf|Cisco Catalyst 3560 Specifications]] * [[media:Cisco3560Guide.pdf|Cisco Catalyst 3560 User Guide]] * [[HCL Cluster Network]] == Cluster Administration == ===Useful Tools=== <code>root</code> on <code>heterogeneous.ucd.ie</code> has a number of [http://expect.nist.gov/ Expect] scripts to automate administration on the cluster (in <code>/root/scripts</code>). <code>root_ssh</code> will automatically log into a host, provide the root password and either return a shell to the user or execute a command that is passed as a second argument. 
Command syntax is as follows: <source lang="text"> # root_ssh usage: root_ssh [user@]<host> [command] </source> Example usage, to login and execute a command on each node in the cluster (note the file <code>/etc/dsh/machines.list</code> contains the hostnames of all compute nodes of the cluster): # for i in `cat /etc/dsh/machines.list`; do root_ssh $i ps ax \| grep pbs; done The above is sequential. To run parallel jobs, for example: <code>apt-get update && apt-get -y upgrade</code>, try the following trick with [http://www.gnu.org/software/screen/ screen]: # for i in `cat /etc/dsh/machines.list`; do screen -L -d -m root_ssh $i apt-get update \&\& apt-get -y upgrade'; done You can check the screenlog.* files for errors and delete them when you are happy. Sometimes all logs are sent to screenlog.0, not sure why. == Software packages available on HCL Cluster 2.0 == Wit a fresh installation of operating systems on HCL Cluster the follow list of packages are avalible: * autoconf * automake * fftw2 * git * gfortran * gnuplot * libtool * netperf * octave3.2 * qhull * subversion * valgrind * gsl-dev * vim * python * mc * openmpi-bin * openmpi-dev * evince * libboost-graph-dev * libboost-serialization-dev * r-cran-strucchange * graphviz * doxygen * colorgcc [[new hcl node install & configuration log]] [[new heterogeneous.ucd.ie install log]] ===APT=== To do unattended updates on cluster machines you need to specify some environment variables and switches to apt-get: export DEBIAN_FRONTEND=noninteractive apt-get -q -y upgrade NOTE: on hcl01 and hcl02 any updates to grub will force a prompt, despite the switches above. This happens because there are two disks on these machines and grub asks which it should install itself on. == Access and Security == All access and security for the cluster is handled by the gateway machine (heterogeneous.ucd.ie). This machine is not considered a compute node and should not be used as such. 
The only new incoming connections allowed are ssh, other incoming packets such as http that are responding to requests from inside the cluster (established or related) are also allowed. Incoming ssh packets are only accepted if they are originating from designated IP addresses. These IP's must be registered ucd IP's. csserver.ucd.ie is allowed, as is hclgate.ucd.ie, on which all users have accounts. Thus to gain access to the cluster you can ssh from csserver, hclgate or other allowed machines to heterogeneous. From there you can ssh to any of the nodes (hcl01-hcl16) that you are running a pbs job on. Access from outside the UCD network is only allowed once you have gained entry to a server that allows outside connections (such as csserver.ucd.ie) === Access to the nodes is controlled by torque pbs.=== Use qsub to submit a job, -I is for an interactive session, walltime is time required. qsub -I -l walltime=1:00 \\ Reserve 1 node for 1 hour qsub -l nodes=hcl01+hcl07,walltime=1:00 myscript.sh Example Script: #!/bin/sh #General Script # # #These commands set up the Grid Environment for your job: #PBS -N JOBNAME #PBS -l walltime=48:00:00 #PBS -l nodes=16 #PBS -m abe #PBS -k eo #PBS -V echo foo So see the queue qstat -n showq To remove your job qdel JOBNUM More info: [http://www.clusterresources.com/products/torque/docs/] == Some networking issues on HCL cluster (unsolved) == "/sbin/route" should give: Kernel IP routing table Destination Gateway Genmask Flags Metric Ref Use Iface 239.2.11.72 * 255.255.255.255 UH 0 0 0 eth0 heterogeneous.u * 255.255.255.255 UH 0 0 0 eth0 192.168.21.0 * 255.255.255.0 U 0 0 0 eth1 192.168.20.0 * 255.255.255.0 U 0 0 0 eth0 192.168.20.0 * 255.255.254.0 U 0 0 0 eth0 192.168.20.0 * 255.255.254.0 U 0 0 0 eth1 default heterogeneous.u 0.0.0.0 UG 0 0 0 eth0 For reasons unclear, sometimes many machines miss the entry: 192.168.21.0 * 255.255.255.0 U 0 0 0 eth1 For Open MPI, this leads to inability to do a system sockets "connect" call to any 
192.*.21.* address (hangup). For this case, you can * switch off eth1 (see also [http://hcl.ucd.ie/wiki/index.php/OpenMPI] ): mpirun --mca btl_tcp_if_exclude lo,eth1 ... or * you can restore the above table on all nodes by running "sh /etc/network/if-up.d/00routes" as root It is not yet clear why without this entry the connection to the "21" addresses can't be connected. We expect that in this case following rule should be matched (because of the mask): 192.168.20.0 * 255.255.254.0 U 0 0 0 eth0 The packets leave over the eth0 network interface then and should go over switch1 to switch2 and eth1 interface of the corresponding node * If one attempts a ping from one node A, via its eth0 interface, to the address of another node's (B) eth1 interface, the following is observed: ** outgoing ping packets appear only on the eth0 interface of the first node A. ** incoming ping packets appear only on eth1 interface of the second node B. ** outgoing ping response packets appear on the eth0 interface of the second node B, never on the eth1 interface despite pinging the eth1 address specifically. What explains this? With the routing tables as they are above, or in the damaged case, the ping may arrive to the correct interface, but the response from B is routed to A-eth0 via B-eth0. Further, after a number of ping packets have been sent in sequence (50 to 100), pings from A, though the -i eth0 switch is specified, begin to appear on both A-eth0 and A-eth1. This behaviour is unexpected, but does not effect the return path of the ping response packet. In order to get a symmetric behaviour, where a packet leaves A-eth0, travels via the switch bridge to B-eth1 and returns back from B-eth1 to A-eth0, one must ensure the routing table of B contains no eth0 entries. == Paging and the OOM-Killer == Due to the nature of experiments run on the cluster, we often induce heavy paging and complete exhaustion of available memory on certain nodes. 
Linux has a pair of strategies to deal with heavy memory use. First, is overcommitting. This is where a process is allowed allocate or fork even when there is no more memory available. You can seem some interesting numbers here:[http://opsmonkey.blogspot.com/2007/01/linux-memory-overcommit.html]. The assumption is that processes may not use all memory that they allocate and failing on allocation is worse than failing at a later date when the memory use is actually required. More processes may be supported by allowing them to allocate memory (provided they do not use it all). The second part of the strategy is the Out-of-Memory killer (OOM Killer). When memory has been exhausted and a process tries to use some 'overcommitted' part of memory, the OOM killer is invoked. It's job is to rank all processes in terms of their memory use, priority, privilege and some other parameters, and then select a process to kill based on the ranks. The argument for using overcommital+OOM Killer is that rather than failing to allocate memory for some random unlucky process, which as a result would probably terminate, the kernel can instead allow the unlucky process to continue executing and then make a some-what-informed decision on which process to kill. Unfortunately, the behaviour of the OOM-killer sometimes causes problems which grind the machine to a complete halt, particularly when it decides to kill system processes. There is a good discussion on the OOM-killer here: [http://lwn.net/Articles/104179/] For this reason overcommit has been disabled on the cluster. cat /proc/sys/vm/overcommit_memory 2 cat /proc/sys/vm/overcommit_ratio 100 To restore to default overcommit # echo 0 > /proc/sys/vm/overcommit_memory # echo 50 > /proc/sys/vm/overcommit_ratio 154f00502ea6c0f934d8eb355ea3a5c3ac816540 482 481 2010-09-16T18:20:47Z Davepc 2 /* Access to the nodes is controlled by torque pbs. 
*/ wikitext text/x-wiki == General Information == [[Image:Cluster.jpg|right|thumbnail||HCL Cluster]] [[Image:network.jpg|right|thumbnail||Layout of the Cluster]] The hcl cluster is heterogeneous in computing hardware & network ability. Nodes are from Dell, IBM, and HP, with Celeron, Pentium 4, Xeon, and AMD processors ranging in speeds from 1.8 to 3.6Ghz. Accordingly architectures and parameters such as Front Side Bus, Cache, and Main Memory all vary. Operating System used is Debian “squeeze” with Linux kernel 2.6.32. The network hardware consists of two Cisco 24+4 port Gigabit switches. Each node has two Gigabit ethernet ports - each eth0 is connected to the first switch, and each eth1 is connected to the second switch. The switches are also connected to each other. The bandwidth of each port can be configured to meet any value between 8Kb/s and 1Gb/s. This allows testing on a very large number of network topologies, As the bandwidth on the link connecting the two switches can also be configured, the cluster can actually act as two separate clusters connected via one link. The diagram shows a schematic of the cluster. 
=== Detailed Cluster Specification === * [[HCL Cluster Specifications]] * [[Old HCL Cluster Specifications]] === Documentation === * [[media:PE750.tgz|Dell Poweredge 750 Documentation]] * [[media:SC1425.tgz|Dell Poweredge SC1425 Documentation]] * [[media:X306.pdf|IBM x-Series 306 Documentation]] * [[media:E326.pdf|IBM e-Series 326 Documentation ]] * [[media:Proliant100SeriesGuide.pdf|HP Proliant DL-140 G2 Documentation]] * [[media:ProliantDL320G3Guide.pdf|HP Proliant DL-320 G3 Documentation]] * [[media:Cisco3560Specs.pdf|Cisco Catalyst 3560 Specifications]] * [[media:Cisco3560Guide.pdf|Cisco Catalyst 3560 User Guide]] * [[HCL Cluster Network]] == Cluster Administration == ===Useful Tools=== <code>root</code> on <code>heterogeneous.ucd.ie</code> has a number of [http://expect.nist.gov/ Expect] scripts to automate administration on the cluster (in <code>/root/scripts</code>). <code>root_ssh</code> will automatically log into a host, provide the root password and either return a shell to the user or execute a command that is passed as a second argument. Command syntax is as follows: <source lang="text"> # root_ssh usage: root_ssh [user@]<host> [command] </source> Example usage, to login and execute a command on each node in the cluster (note the file <code>/etc/dsh/machines.list</code> contains the hostnames of all compute nodes of the cluster): # for i in `cat /etc/dsh/machines.list`; do root_ssh $i ps ax \| grep pbs; done The above is sequential. To run parallel jobs, for example: <code>apt-get update && apt-get -y upgrade</code>, try the following trick with [http://www.gnu.org/software/screen/ screen]: # for i in `cat /etc/dsh/machines.list`; do screen -L -d -m root_ssh $i apt-get update \&\& apt-get -y upgrade'; done You can check the screenlog.* files for errors and delete them when you are happy. Sometimes all logs are sent to screenlog.0, not sure why. 
== Software packages available on HCL Cluster 2.0 == Wit a fresh installation of operating systems on HCL Cluster the follow list of packages are avalible: * autoconf * automake * fftw2 * git * gfortran * gnuplot * libtool * netperf * octave3.2 * qhull * subversion * valgrind * gsl-dev * vim * python * mc * openmpi-bin * openmpi-dev * evince * libboost-graph-dev * libboost-serialization-dev * r-cran-strucchange * graphviz * doxygen * colorgcc [[new hcl node install & configuration log]] [[new heterogeneous.ucd.ie install log]] ===APT=== To do unattended updates on cluster machines you need to specify some environment variables and switches to apt-get: export DEBIAN_FRONTEND=noninteractive apt-get -q -y upgrade NOTE: on hcl01 and hcl02 any updates to grub will force a prompt, despite the switches above. This happens because there are two disks on these machines and grub asks which it should install itself on. == Access and Security == All access and security for the cluster is handled by the gateway machine (heterogeneous.ucd.ie). This machine is not considered a compute node and should not be used as such. The only new incoming connections allowed are ssh, other incoming packets such as http that are responding to requests from inside the cluster (established or related) are also allowed. Incoming ssh packets are only accepted if they are originating from designated IP addresses. These IP's must be registered ucd IP's. csserver.ucd.ie is allowed, as is hclgate.ucd.ie, on which all users have accounts. Thus to gain access to the cluster you can ssh from csserver, hclgate or other allowed machines to heterogeneous. From there you can ssh to any of the nodes (hcl01-hcl16) that you are running a pbs job on. 
Access from outside the UCD network is only allowed once you have gained entry to a server that allows outside connections (such as csserver.ucd.ie) === Access to the nodes is controlled by Torque PBS.=== Use qsub to submit a job, -I is for an interactive session, walltime is time required. qsub -I -l walltime=1:00 \\ Reserve 1 node for 1 hour qsub -l nodes=hcl01+hcl07,walltime=1:00 myscript.sh Example Script: #!/bin/sh #General Script # # #These commands set up the Grid Environment for your job: #PBS -N JOBNAME #PBS -l walltime=48:00:00 #PBS -l nodes=16 #PBS -m abe #PBS -k eo #PBS -V echo foo So see the queue qstat -n showq To remove your job qdel JOBNUM More info: [http://www.clusterresources.com/products/torque/docs/] == Some networking issues on HCL cluster (unsolved) == "/sbin/route" should give: Kernel IP routing table Destination Gateway Genmask Flags Metric Ref Use Iface 239.2.11.72 * 255.255.255.255 UH 0 0 0 eth0 heterogeneous.u * 255.255.255.255 UH 0 0 0 eth0 192.168.21.0 * 255.255.255.0 U 0 0 0 eth1 192.168.20.0 * 255.255.255.0 U 0 0 0 eth0 192.168.20.0 * 255.255.254.0 U 0 0 0 eth0 192.168.20.0 * 255.255.254.0 U 0 0 0 eth1 default heterogeneous.u 0.0.0.0 UG 0 0 0 eth0 For reasons unclear, sometimes many machines miss the entry: 192.168.21.0 * 255.255.255.0 U 0 0 0 eth1 For Open MPI, this leads to inability to do a system sockets "connect" call to any 192.*.21.* address (hangup). For this case, you can * switch off eth1 (see also [http://hcl.ucd.ie/wiki/index.php/OpenMPI] ): mpirun --mca btl_tcp_if_exclude lo,eth1 ... or * you can restore the above table on all nodes by running "sh /etc/network/if-up.d/00routes" as root It is not yet clear why without this entry the connection to the "21" addresses can't be connected. 
We expect that in this case following rule should be matched (because of the mask): 192.168.20.0 * 255.255.254.0 U 0 0 0 eth0 The packets leave over the eth0 network interface then and should go over switch1 to switch2 and eth1 interface of the corresponding node * If one attempts a ping from one node A, via its eth0 interface, to the address of another node's (B) eth1 interface, the following is observed: ** outgoing ping packets appear only on the eth0 interface of the first node A. ** incoming ping packets appear only on eth1 interface of the second node B. ** outgoing ping response packets appear on the eth0 interface of the second node B, never on the eth1 interface despite pinging the eth1 address specifically. What explains this? With the routing tables as they are above, or in the damaged case, the ping may arrive to the correct interface, but the response from B is routed to A-eth0 via B-eth0. Further, after a number of ping packets have been sent in sequence (50 to 100), pings from A, though the -i eth0 switch is specified, begin to appear on both A-eth0 and A-eth1. This behaviour is unexpected, but does not effect the return path of the ping response packet. In order to get a symmetric behaviour, where a packet leaves A-eth0, travels via the switch bridge to B-eth1 and returns back from B-eth1 to A-eth0, one must ensure the routing table of B contains no eth0 entries. == Paging and the OOM-Killer == Due to the nature of experiments run on the cluster, we often induce heavy paging and complete exhaustion of available memory on certain nodes. Linux has a pair of strategies to deal with heavy memory use. First, is overcommitting. This is where a process is allowed allocate or fork even when there is no more memory available. You can seem some interesting numbers here:[http://opsmonkey.blogspot.com/2007/01/linux-memory-overcommit.html]. 
The assumption is that processes may not use all memory that they allocate and failing on allocation is worse than failing at a later date when the memory use is actually required. More processes may be supported by allowing them to allocate memory (provided they do not use it all). The second part of the strategy is the Out-of-Memory killer (OOM Killer). When memory has been exhausted and a process tries to use some 'overcommitted' part of memory, the OOM killer is invoked. It's job is to rank all processes in terms of their memory use, priority, privilege and some other parameters, and then select a process to kill based on the ranks. The argument for using overcommital+OOM Killer is that rather than failing to allocate memory for some random unlucky process, which as a result would probably terminate, the kernel can instead allow the unlucky process to continue executing and then make a some-what-informed decision on which process to kill. Unfortunately, the behaviour of the OOM-killer sometimes causes problems which grind the machine to a complete halt, particularly when it decides to kill system processes. There is a good discussion on the OOM-killer here: [http://lwn.net/Articles/104179/] For this reason overcommit has been disabled on the cluster. cat /proc/sys/vm/overcommit_memory 2 cat /proc/sys/vm/overcommit_ratio 100 To restore to default overcommit # echo 0 > /proc/sys/vm/overcommit_memory # echo 50 > /proc/sys/vm/overcommit_ratio 54c6ba16fa43c9b2c10fe24db38fa8435a0cf47f 484 482 2010-09-16T18:23:20Z Davepc 2 /* General Information */ wikitext text/x-wiki == General Information == [[Image:Cluster.jpg|right|thumbnail||HCL Cluster]] [[Image:network.jpg|right|thumbnail||Layout of the Cluster]] The hcl cluster is heterogeneous in computing hardware & network ability. Nodes are from Dell, IBM, and HP, with Celeron, Pentium 4, Xeon, and AMD processors ranging in speeds from 1.8 to 3.6Ghz. 
Accordingly architectures and parameters such as Front Side Bus, Cache, and Main Memory all vary. Operating System used is Debian “squeeze” with Linux kernel 2.6.32. The network hardware consists of two Cisco 24+4 port Gigabit switches. Each node has two Gigabit ethernet ports - each eth0 is connected to the first switch, and each eth1 is connected to the second switch. The switches are also connected to each other. The bandwidth of each port can be configured to meet any value between 8Kb/s and 1Gb/s. This allows testing on a very large number of network topologies, As the bandwidth on the link connecting the two switches can also be configured, the cluster can actually act as two separate clusters connected via one link. The diagram shows a schematic of the cluster. === Detailed Cluster Specification === * [[HCL Cluster Specifications]] * [[Old HCL Cluster Specifications]] (Pre May 2010) === Documentation === * [[media:PE750.tgz|Dell Poweredge 750 Documentation]] * [[media:SC1425.tgz|Dell Poweredge SC1425 Documentation]] * [[media:X306.pdf|IBM x-Series 306 Documentation]] * [[media:E326.pdf|IBM e-Series 326 Documentation ]] * [[media:Proliant100SeriesGuide.pdf|HP Proliant DL-140 G2 Documentation]] * [[media:ProliantDL320G3Guide.pdf|HP Proliant DL-320 G3 Documentation]] * [[media:Cisco3560Specs.pdf|Cisco Catalyst 3560 Specifications]] * [[media:Cisco3560Guide.pdf|Cisco Catalyst 3560 User Guide]] * [[HCL Cluster Network]] == Cluster Administration == ===Useful Tools=== <code>root</code> on <code>heterogeneous.ucd.ie</code> has a number of [http://expect.nist.gov/ Expect] scripts to automate administration on the cluster (in <code>/root/scripts</code>). <code>root_ssh</code> will automatically log into a host, provide the root password and either return a shell to the user or execute a command that is passed as a second argument. 
Command syntax is as follows: <source lang="text"> # root_ssh usage: root_ssh [user@]<host> [command] </source> Example usage, to login and execute a command on each node in the cluster (note the file <code>/etc/dsh/machines.list</code> contains the hostnames of all compute nodes of the cluster): # for i in `cat /etc/dsh/machines.list`; do root_ssh $i ps ax \| grep pbs; done The above is sequential. To run parallel jobs, for example: <code>apt-get update && apt-get -y upgrade</code>, try the following trick with [http://www.gnu.org/software/screen/ screen]: # for i in `cat /etc/dsh/machines.list`; do screen -L -d -m root_ssh $i apt-get update \&\& apt-get -y upgrade'; done You can check the screenlog.* files for errors and delete them when you are happy. Sometimes all logs are sent to screenlog.0, not sure why. == Software packages available on HCL Cluster 2.0 == Wit a fresh installation of operating systems on HCL Cluster the follow list of packages are avalible: * autoconf * automake * fftw2 * git * gfortran * gnuplot * libtool * netperf * octave3.2 * qhull * subversion * valgrind * gsl-dev * vim * python * mc * openmpi-bin * openmpi-dev * evince * libboost-graph-dev * libboost-serialization-dev * r-cran-strucchange * graphviz * doxygen * colorgcc [[new hcl node install & configuration log]] [[new heterogeneous.ucd.ie install log]] ===APT=== To do unattended updates on cluster machines you need to specify some environment variables and switches to apt-get: export DEBIAN_FRONTEND=noninteractive apt-get -q -y upgrade NOTE: on hcl01 and hcl02 any updates to grub will force a prompt, despite the switches above. This happens because there are two disks on these machines and grub asks which it should install itself on. == Access and Security == All access and security for the cluster is handled by the gateway machine (heterogeneous.ucd.ie). This machine is not considered a compute node and should not be used as such. 
The only new incoming connections allowed are ssh, other incoming packets such as http that are responding to requests from inside the cluster (established or related) are also allowed. Incoming ssh packets are only accepted if they are originating from designated IP addresses. These IP's must be registered ucd IP's. csserver.ucd.ie is allowed, as is hclgate.ucd.ie, on which all users have accounts. Thus to gain access to the cluster you can ssh from csserver, hclgate or other allowed machines to heterogeneous. From there you can ssh to any of the nodes (hcl01-hcl16) that you are running a pbs job on. Access from outside the UCD network is only allowed once you have gained entry to a server that allows outside connections (such as csserver.ucd.ie) === Access to the nodes is controlled by Torque PBS.=== Use qsub to submit a job, -I is for an interactive session, walltime is time required. qsub -I -l walltime=1:00 \\ Reserve 1 node for 1 hour qsub -l nodes=hcl01+hcl07,walltime=1:00 myscript.sh Example Script: #!/bin/sh #General Script # # #These commands set up the Grid Environment for your job: #PBS -N JOBNAME #PBS -l walltime=48:00:00 #PBS -l nodes=16 #PBS -m abe #PBS -k eo #PBS -V echo foo So see the queue qstat -n showq To remove your job qdel JOBNUM More info: [http://www.clusterresources.com/products/torque/docs/] == Some networking issues on HCL cluster (unsolved) == "/sbin/route" should give: Kernel IP routing table Destination Gateway Genmask Flags Metric Ref Use Iface 239.2.11.72 * 255.255.255.255 UH 0 0 0 eth0 heterogeneous.u * 255.255.255.255 UH 0 0 0 eth0 192.168.21.0 * 255.255.255.0 U 0 0 0 eth1 192.168.20.0 * 255.255.255.0 U 0 0 0 eth0 192.168.20.0 * 255.255.254.0 U 0 0 0 eth0 192.168.20.0 * 255.255.254.0 U 0 0 0 eth1 default heterogeneous.u 0.0.0.0 UG 0 0 0 eth0 For reasons unclear, sometimes many machines miss the entry: 192.168.21.0 * 255.255.255.0 U 0 0 0 eth1 For Open MPI, this leads to inability to do a system sockets "connect" call to any 
192.*.21.* address (hangup). For this case, you can * switch off eth1 (see also [http://hcl.ucd.ie/wiki/index.php/OpenMPI] ): mpirun --mca btl_tcp_if_exclude lo,eth1 ... or * you can restore the above table on all nodes by running "sh /etc/network/if-up.d/00routes" as root It is not yet clear why without this entry the connection to the "21" addresses can't be connected. We expect that in this case following rule should be matched (because of the mask): 192.168.20.0 * 255.255.254.0 U 0 0 0 eth0 The packets leave over the eth0 network interface then and should go over switch1 to switch2 and eth1 interface of the corresponding node * If one attempts a ping from one node A, via its eth0 interface, to the address of another node's (B) eth1 interface, the following is observed: ** outgoing ping packets appear only on the eth0 interface of the first node A. ** incoming ping packets appear only on eth1 interface of the second node B. ** outgoing ping response packets appear on the eth0 interface of the second node B, never on the eth1 interface despite pinging the eth1 address specifically. What explains this? With the routing tables as they are above, or in the damaged case, the ping may arrive to the correct interface, but the response from B is routed to A-eth0 via B-eth0. Further, after a number of ping packets have been sent in sequence (50 to 100), pings from A, though the -i eth0 switch is specified, begin to appear on both A-eth0 and A-eth1. This behaviour is unexpected, but does not effect the return path of the ping response packet. In order to get a symmetric behaviour, where a packet leaves A-eth0, travels via the switch bridge to B-eth1 and returns back from B-eth1 to A-eth0, one must ensure the routing table of B contains no eth0 entries. == Paging and the OOM-Killer == Due to the nature of experiments run on the cluster, we often induce heavy paging and complete exhaustion of available memory on certain nodes. 
Linux has a pair of strategies to deal with heavy memory use. The first is overcommitting: a process is allowed to allocate or fork even when there is no more memory available. You can see some interesting numbers here: [http://opsmonkey.blogspot.com/2007/01/linux-memory-overcommit.html]. The assumption is that processes may not use all the memory they allocate, and that failing at allocation time is worse than failing later, when the memory is actually used. More processes can be supported by allowing them to allocate memory (provided they do not use it all). The second part of the strategy is the Out-of-Memory killer (OOM killer). When memory has been exhausted and a process tries to use some 'overcommitted' part of memory, the OOM killer is invoked. Its job is to rank all processes in terms of their memory use, priority, privilege and some other parameters, and then select a process to kill based on those ranks. The argument for using overcommit plus the OOM killer is that rather than failing to allocate memory for some random unlucky process, which would probably terminate as a result, the kernel can instead allow the unlucky process to continue executing and then make a somewhat informed decision about which process to kill. Unfortunately, the behaviour of the OOM killer sometimes causes problems that grind the machine to a complete halt, particularly when it decides to kill system processes. There is a good discussion of the OOM killer here: [http://lwn.net/Articles/104179/] For this reason, overcommit has been disabled on the cluster.
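The overcommit policy described above can be inspected directly from /proc. A minimal sketch, assuming a Linux /proc filesystem; the interpretation of the values follows the kernel's overcommit-accounting documentation:

```shell
#!/bin/sh
# Sketch: inspect the kernel's overcommit policy.
#   overcommit_memory: 0 = heuristic overcommit (default),
#                      1 = always overcommit, 2 = never overcommit
#   overcommit_ratio:  with mode 2, commit limit = swap + ratio% of RAM
mode=$(cat /proc/sys/vm/overcommit_memory)
ratio=$(cat /proc/sys/vm/overcommit_ratio)
echo "overcommit_memory=$mode overcommit_ratio=$ratio"

# With mode 2 the kernel enforces CommitLimit; Committed_AS shows how
# much address space is currently committed against that limit.
grep -E '^(CommitLimit|Committed_AS):' /proc/meminfo
```

On the cluster nodes this should report mode 2 with ratio 100, matching the settings shown below; when Committed_AS approaches CommitLimit, further allocations fail immediately instead of triggering the OOM killer.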
cat /proc/sys/vm/overcommit_memory 2 cat /proc/sys/vm/overcommit_ratio 100 To restore to default overcommit # echo 0 > /proc/sys/vm/overcommit_memory # echo 50 > /proc/sys/vm/overcommit_ratio 80633af832cee9b9249f6f1cbbfa4c4ce6ffc700 HCL Cluster Specifications 0 50 449 394 2010-07-25T14:33:37Z Rhiggins 4 wikitext text/x-wiki {| border="1" cellspacing="1" cellpadding="5" | Name | Make/Model | IP | Processor | Front Side Bus | L2 Cache | RAM | HDD 1 | HDD 2 | NIC | Rack |- | ALIGN=LEFT | hclswitch1 | ALIGN=LEFT | Cisco Catalyst 3560G | ALIGN=LEFT | 192.168.21.252 | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | 24 x Gigabit | 42 |- | ALIGN=LEFT | hclswitch2 | ALIGN=LEFT | Cisco Catalyst 3560G | ALIGN=LEFT | 192.168.21.253 | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | 24 x Gigabit | 41 |- | ALIGN=LEFT | N/A | ALIGN=LEFT | APC Smart UPS 1500 | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | 1 – 2 |- | ALIGN=LEFT | [[Hcl01]] (NIC1) | ALIGN=LEFT | Dell Poweredge 750 | ALIGN=LEFT | 192.168.21.3 | ALIGN=LEFT | 3.4 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | 250GB SATA | ALIGN=LEFT | 2 x Gigabit | 3 |- | ALIGN=LEFT | Hcl01 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.103 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 3 |- | ALIGN=LEFT | [[Hcl02]] (NIC1) | ALIGN=LEFT | Dell Poweredge 750 | ALIGN=LEFT | 192.168.21.4 | ALIGN=LEFT | 3.4 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | 250GB SATA | ALIGN=LEFT | 2 x Gigabit | 4 |- | ALIGN=LEFT | Hcl02 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 
192.168.21.104 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 4 |- | ALIGN=LEFT | [[Hcl03]] (NIC1) | ALIGN=LEFT | Dell Poweredge 750 | ALIGN=LEFT | 192.168.21.5 | ALIGN=LEFT | 3.4 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit | 5 |- | ALIGN=LEFT | Hcl03 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.105 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 5 |- | ALIGN=LEFT | [[Hcl04]] (NIC1) | ALIGN=LEFT | Dell Poweredge 750 | ALIGN=LEFT | 192.168.21.6 | ALIGN=LEFT | 3.4 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit | 6 |- | ALIGN=LEFT | Hcl04 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.106 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 6 |- | ALIGN=LEFT | [[Hcl05]] (NIC1) | ALIGN=LEFT | Dell Poweredge SC1425 | ALIGN=LEFT | 192.168.21.7 | ALIGN=LEFT | 3.6 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 2MB | ALIGN=LEFT | 256MB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit | 7 |- | ALIGN=LEFT | Hcl05 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.107 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 7 |- | ALIGN=LEFT | [[Hcl06]] (NIC1) | ALIGN=LEFT | Dell Poweredge SC1425 | ALIGN=LEFT | 192.168.21.8 | ALIGN=LEFT | 3.0 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 2MB | ALIGN=LEFT | 256MB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit | 8 |- | ALIGN=LEFT | Hcl06 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.108 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 
ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 8 |- | ALIGN=LEFT | [[Hcl07]](NIC1) | ALIGN=LEFT | Dell Poweredge 750 | ALIGN=LEFT | 192.168.21.9 | ALIGN=LEFT | 3.4 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 256MB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit | 9 |- | ALIGN=LEFT | Hcl07 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.109 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 9 |- | ALIGN=LEFT | [[Hcl08]] (NIC1) | ALIGN=LEFT | Dell Poweredge 750 | ALIGN=LEFT | 192.168.21.10 | ALIGN=LEFT | 3.4 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 256MB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit | 10 |- | ALIGN=LEFT | Hcl08 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.110 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 10 |- | ALIGN=LEFT | [[Hcl09]] (NIC1) | ALIGN=LEFT | IBM E-server 326 | ALIGN=LEFT | 192.168.21.11 | ALIGN=LEFT | 1.8 AMD Opteron | ALIGN=LEFT | 1GHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit | 11 |- | ALIGN=LEFT | Hcl09 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.111 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 11 |- | ALIGN=LEFT | [[Hcl10]] (NIC1) | ALIGN=LEFT | IBM E-server 326 | ALIGN=LEFT | 192.168.21.12 | ALIGN=LEFT | 1.8 AMD Opteron | ALIGN=LEFT | 1GHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit | 12 |- | ALIGN=LEFT | Hcl10 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.112 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> 
| ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 12 |- | ALIGN=LEFT | [[Hcl11]] (NIC1) | ALIGN=LEFT | IBM X-Series 306 | ALIGN=LEFT | 192.168.21.13 | ALIGN=LEFT | 3.2 P4 | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 512MB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit | 13 |- | ALIGN=LEFT | Hcl11 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.113 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 13 |- | ALIGN=LEFT | [[Hcl12]] (NIC1) | ALIGN=LEFT | HP Proliant DL 320 G3 | ALIGN=LEFT | 192.168.21.14 | ALIGN=LEFT | 3.4 P4 | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 512MB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit | 14 |- | ALIGN=LEFT | Hcl12 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.114 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 14 |- | ALIGN=LEFT | [[Hcl13]] (NIC1) | ALIGN=LEFT | HP Proliant DL 320 G3 | ALIGN=LEFT | 192.168.21.15 | ALIGN=LEFT | 2.9 Celeron | ALIGN=LEFT | 533MHz | ALIGN=LEFT | 256KB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit | 15 |- | ALIGN=LEFT | Hcl13 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.115 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 15 |- | ALIGN=LEFT | [[Hcl14]] (NIC1) | ALIGN=LEFT | HP Proliant DL 140 G2 | ALIGN=LEFT | 192.168.21.16 | ALIGN=LEFT | 3.4 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit | 16 |- | ALIGN=LEFT | Hcl14 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.116 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 16 |- | 
ALIGN=LEFT | [[Hcl15]] (NIC1) | ALIGN=LEFT | HP Proliant DL 140 G2 | ALIGN=LEFT | 192.168.21.17 | ALIGN=LEFT | 2.8 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit | 17 |- | ALIGN=LEFT | Hcl15 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.117 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 17 |- | ALIGN=LEFT | [[Hcl16]] (NIC1) | ALIGN=LEFT | HP Proliant DL 140 G2 | ALIGN=LEFT | 192.168.21.18 | ALIGN=LEFT | 3.6 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 2MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit | 18 |- | ALIGN=LEFT | Hcl16 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.118 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 18 |} 639a62a5dac6e6ffd43573bb8e8de887921736d4 453 449 2010-07-29T12:13:14Z Rhiggins 4 wikitext text/x-wiki ==Cluster Benchmarks== ===Stream=== hcl01.heterogeneous.ucd.ie ------------------------------------------------------------- STREAM version $Revision: 5.9 $ ------------------------------------------------------------- This system uses 8 bytes per DOUBLE PRECISION word.
------------------------------------------------------------- Array size = 2000000, Offset = 0 Total memory required = 45.8 MB. Each test is run 10 times, but only the *best* time for each is used. ------------------------------------------------------------- Number of Threads requested = 1 ------------------------------------------------------------- Printing one line per active thread.... ------------------------------------------------------------- Your clock granularity appears to be less than one microsecond. Each test below will take on the order of 8386 microseconds. (= 8386 clock ticks) Increase the size of the arrays if this shows that you are not getting at least 20 clock ticks per test. ------------------------------------------------------------- WARNING -- The above is only a rough guideline. For best results, please be sure you know the precision of your system timer. ------------------------------------------------------------- Function Rate (MB/s) Avg time Min time Max time Copy: 2615.4617 0.0125 0.0122 0.0132 Scale: 2609.2783 0.0125 0.0123 0.0133 Add: 3046.4707 0.0161 0.0158 0.0168 Triad: 3064.7322 0.0160 0.0157 0.0166 ------------------------------------------------------------- Solution Validates ------------------------------------------------------------- hcl02.heterogeneous.ucd.ie ------------------------------------------------------------- STREAM version $Revision: 5.9 $ ------------------------------------------------------------- This system uses 8 bytes per DOUBLE PRECISION word. ------------------------------------------------------------- Array size = 2000000, Offset = 0 Total memory required = 45.8 MB. Each test is run 10 times, but only the *best* time for each is used. ------------------------------------------------------------- Number of Threads requested = 1 ------------------------------------------------------------- Printing one line per active thread.... 
------------------------------------------------------------- Your clock granularity appears to be less than one microsecond. Each test below will take on the order of 7790 microseconds. (= 7790 clock ticks) Increase the size of the arrays if this shows that you are not getting at least 20 clock ticks per test. ------------------------------------------------------------- WARNING -- The above is only a rough guideline. For best results, please be sure you know the precision of your system timer. ------------------------------------------------------------- Function Rate (MB/s) Avg time Min time Max time Copy: 2740.6840 0.0117 0.0117 0.0120 Scale: 2745.3930 0.0117 0.0117 0.0117 Add: 3063.1599 0.0157 0.0157 0.0157 Triad: 3075.3572 0.0156 0.0156 0.0159 ------------------------------------------------------------- Solution Validates ------------------------------------------------------------- hcl03.heterogeneous.ucd.ie ------------------------------------------------------------- STREAM version $Revision: 5.9 $ ------------------------------------------------------------- This system uses 8 bytes per DOUBLE PRECISION word. ------------------------------------------------------------- Array size = 2000000, Offset = 0 Total memory required = 45.8 MB. Each test is run 10 times, but only the *best* time for each is used. ------------------------------------------------------------- Number of Threads requested = 1 ------------------------------------------------------------- Printing one line per active thread.... ------------------------------------------------------------- Your clock granularity appears to be less than one microsecond. Each test below will take on the order of 8382 microseconds. (= 8382 clock ticks) Increase the size of the arrays if this shows that you are not getting at least 20 clock ticks per test. ------------------------------------------------------------- WARNING -- The above is only a rough guideline. 
For best results, please be sure you know the precision of your system timer. ------------------------------------------------------------- Function Rate (MB/s) Avg time Min time Max time Copy: 2788.6728 0.0115 0.0115 0.0115 Scale: 2722.0144 0.0118 0.0118 0.0121 Add: 3266.6166 0.0148 0.0147 0.0150 Triad: 3287.4249 0.0146 0.0146 0.0147 ------------------------------------------------------------- Solution Validates ------------------------------------------------------------- hcl04.heterogeneous.ucd.ie ------------------------------------------------------------- STREAM version $Revision: 5.9 $ ------------------------------------------------------------- This system uses 8 bytes per DOUBLE PRECISION word. ------------------------------------------------------------- Array size = 2000000, Offset = 0 Total memory required = 45.8 MB. Each test is run 10 times, but only the *best* time for each is used. ------------------------------------------------------------- Number of Threads requested = 1 ------------------------------------------------------------- Printing one line per active thread.... ------------------------------------------------------------- Your clock granularity appears to be less than one microsecond. Each test below will take on the order of 8378 microseconds. (= 8378 clock ticks) Increase the size of the arrays if this shows that you are not getting at least 20 clock ticks per test. ------------------------------------------------------------- WARNING -- The above is only a rough guideline. For best results, please be sure you know the precision of your system timer. 
------------------------------------------------------------- Function Rate (MB/s) Avg time Min time Max time Copy: 2815.4071 0.0114 0.0114 0.0114 Scale: 2751.5270 0.0116 0.0116 0.0117 Add: 3260.1979 0.0148 0.0147 0.0150 Triad: 3274.2183 0.0147 0.0147 0.0150 ------------------------------------------------------------- Solution Validates ------------------------------------------------------------- hcl05.heterogeneous.ucd.ie ------------------------------------------------------------- STREAM version $Revision: 5.9 $ ------------------------------------------------------------- This system uses 8 bytes per DOUBLE PRECISION word. ------------------------------------------------------------- Array size = 2000000, Offset = 0 Total memory required = 45.8 MB. Each test is run 10 times, but only the *best* time for each is used. ------------------------------------------------------------- Number of Threads requested = 1 ------------------------------------------------------------- Printing one line per active thread.... ------------------------------------------------------------- Your clock granularity appears to be less than one microsecond. Each test below will take on the order of 14916 microseconds. (= 14916 clock ticks) Increase the size of the arrays if this shows that you are not getting at least 20 clock ticks per test. ------------------------------------------------------------- WARNING -- The above is only a rough guideline. For best results, please be sure you know the precision of your system timer. 
------------------------------------------------------------- Function Rate (MB/s) Avg time Min time Max time Copy: 1581.8068 0.0203 0.0202 0.0204 Scale: 1557.1081 0.0207 0.0206 0.0215 Add: 1807.7015 0.0266 0.0266 0.0266 Triad: 1832.9646 0.0263 0.0262 0.0265 ------------------------------------------------------------- Solution Validates ------------------------------------------------------------- hcl06.heterogeneous.ucd.ie ------------------------------------------------------------- STREAM version $Revision: 5.9 $ ------------------------------------------------------------- This system uses 8 bytes per DOUBLE PRECISION word. ------------------------------------------------------------- Array size = 2000000, Offset = 0 Total memory required = 45.8 MB. Each test is run 10 times, but only the *best* time for each is used. ------------------------------------------------------------- Number of Threads requested = 1 ------------------------------------------------------------- Printing one line per active thread.... ------------------------------------------------------------- Your clock granularity appears to be less than one microsecond. Each test below will take on the order of 14824 microseconds. (= 14824 clock ticks) Increase the size of the arrays if this shows that you are not getting at least 20 clock ticks per test. ------------------------------------------------------------- WARNING -- The above is only a rough guideline. For best results, please be sure you know the precision of your system timer. 
------------------------------------------------------------- Function Rate (MB/s) Avg time Min time Max time Copy: 1558.3835 0.0206 0.0205 0.0206 Scale: 1550.3951 0.0207 0.0206 0.0209 Add: 1863.4239 0.0258 0.0258 0.0259 Triad: 1885.3916 0.0255 0.0255 0.0259 ------------------------------------------------------------- Solution Validates ------------------------------------------------------------- hcl07.heterogeneous.ucd.ie ------------------------------------------------------------- STREAM version $Revision: 5.9 $ ------------------------------------------------------------- This system uses 8 bytes per DOUBLE PRECISION word. ------------------------------------------------------------- Array size = 2000000, Offset = 0 Total memory required = 45.8 MB. Each test is run 10 times, but only the *best* time for each is used. ------------------------------------------------------------- Number of Threads requested = 1 ------------------------------------------------------------- Printing one line per active thread.... ------------------------------------------------------------- Your clock granularity appears to be less than one microsecond. Each test below will take on the order of 13642 microseconds. (= 13642 clock ticks) Increase the size of the arrays if this shows that you are not getting at least 20 clock ticks per test. ------------------------------------------------------------- WARNING -- The above is only a rough guideline. For best results, please be sure you know the precision of your system timer. 
{| border="1" cellspacing="1" cellpadding="5"
| Name || Make/Model || IP || Processor || Front Side Bus || L2 Cache || RAM || HDD 1 || HDD 2 || NIC || Rack
|-
| ALIGN=LEFT | hclswitch1 || ALIGN=LEFT | Cisco Catalyst 3560G || ALIGN=LEFT | 192.168.21.252 || ALIGN=LEFT | N/A || ALIGN=LEFT | N/A || ALIGN=LEFT | N/A || ALIGN=LEFT | N/A || ALIGN=LEFT | N/A || ALIGN=LEFT | N/A || ALIGN=LEFT | 24 x Gigabit || 42
|-
| ALIGN=LEFT | hclswitch2 || ALIGN=LEFT | Cisco Catalyst 3560G || ALIGN=LEFT | 192.168.21.253 || ALIGN=LEFT | N/A || ALIGN=LEFT | N/A || ALIGN=LEFT | N/A || ALIGN=LEFT | N/A || ALIGN=LEFT | N/A || ALIGN=LEFT | N/A || ALIGN=LEFT | 24 x Gigabit || 41
|-
| ALIGN=LEFT | N/A || ALIGN=LEFT | APC Smart UPS 1500 || ALIGN=LEFT | N/A || ALIGN=LEFT | N/A || ALIGN=LEFT | N/A || ALIGN=LEFT | N/A || ALIGN=LEFT | N/A || ALIGN=LEFT | N/A || ALIGN=LEFT | N/A || ALIGN=LEFT | N/A || 1 – 2
|-
| ALIGN=LEFT | [[Hcl01]] (NIC1) || ALIGN=LEFT | Dell Poweredge 750 || ALIGN=LEFT | 192.168.21.3 || ALIGN=LEFT | 3.4 Xeon || ALIGN=LEFT | 800MHz || ALIGN=LEFT | 1MB || ALIGN=LEFT | 1GB || ALIGN=LEFT | 80GB SATA || ALIGN=LEFT | 250GB SATA || ALIGN=LEFT | 2 x Gigabit || 3
|-
| ALIGN=LEFT | Hcl01 (NIC2) || ALIGN=LEFT | <BR> || ALIGN=LEFT | 192.168.21.103 || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || 3
|-
| ALIGN=LEFT | [[Hcl02]] (NIC1) || ALIGN=LEFT | Dell Poweredge 750 || ALIGN=LEFT | 192.168.21.4 || ALIGN=LEFT | 3.4 Xeon || ALIGN=LEFT | 800MHz || ALIGN=LEFT | 1MB || ALIGN=LEFT | 1GB || ALIGN=LEFT | 80GB SATA || ALIGN=LEFT | 250GB SATA || ALIGN=LEFT | 2 x Gigabit || 4
|-
| ALIGN=LEFT | Hcl02 (NIC2) || ALIGN=LEFT | <BR> || ALIGN=LEFT | 192.168.21.104 || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || 4
|-
| ALIGN=LEFT | [[Hcl03]] (NIC1) || ALIGN=LEFT | Dell Poweredge 750 || ALIGN=LEFT | 192.168.21.5 || ALIGN=LEFT | 3.4 Xeon || ALIGN=LEFT | 800MHz || ALIGN=LEFT | 1MB || ALIGN=LEFT | 1GB || ALIGN=LEFT | 80GB SATA || ALIGN=LEFT | N/A || ALIGN=LEFT | 2 x Gigabit || 5
|-
| ALIGN=LEFT | Hcl03 (NIC2) || ALIGN=LEFT | <BR> || ALIGN=LEFT | 192.168.21.105 || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || 5
|-
| ALIGN=LEFT | [[Hcl04]] (NIC1) || ALIGN=LEFT | Dell Poweredge 750 || ALIGN=LEFT | 192.168.21.6 || ALIGN=LEFT | 3.4 Xeon || ALIGN=LEFT | 800MHz || ALIGN=LEFT | 1MB || ALIGN=LEFT | 1GB || ALIGN=LEFT | 80GB SATA || ALIGN=LEFT | N/A || ALIGN=LEFT | 2 x Gigabit || 6
|-
| ALIGN=LEFT | Hcl04 (NIC2) || ALIGN=LEFT | <BR> || ALIGN=LEFT | 192.168.21.106 || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || 6
|-
| ALIGN=LEFT | [[Hcl05]] (NIC1) || ALIGN=LEFT | Dell Poweredge SC1425 || ALIGN=LEFT | 192.168.21.7 || ALIGN=LEFT | 3.6 Xeon || ALIGN=LEFT | 800MHz || ALIGN=LEFT | 2MB || ALIGN=LEFT | 256MB || ALIGN=LEFT | 80GB SATA || ALIGN=LEFT | N/A || ALIGN=LEFT | 2 x Gigabit || 7
|-
| ALIGN=LEFT | Hcl05 (NIC2) || ALIGN=LEFT | <BR> || ALIGN=LEFT | 192.168.21.107 || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || 7
|-
| ALIGN=LEFT | [[Hcl06]] (NIC1) || ALIGN=LEFT | Dell Poweredge SC1425 || ALIGN=LEFT | 192.168.21.8 || ALIGN=LEFT | 3.0 Xeon || ALIGN=LEFT | 800MHz || ALIGN=LEFT | 2MB || ALIGN=LEFT | 256MB || ALIGN=LEFT | 80GB SATA || ALIGN=LEFT | N/A || ALIGN=LEFT | 2 x Gigabit || 8
|-
| ALIGN=LEFT | Hcl06 (NIC2) || ALIGN=LEFT | <BR> || ALIGN=LEFT | 192.168.21.108 || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || 8
|-
| ALIGN=LEFT | [[Hcl07]] (NIC1) || ALIGN=LEFT | Dell Poweredge 750 || ALIGN=LEFT | 192.168.21.9 || ALIGN=LEFT | 3.4 Xeon || ALIGN=LEFT | 800MHz || ALIGN=LEFT | 1MB || ALIGN=LEFT | 256MB || ALIGN=LEFT | 80GB SATA || ALIGN=LEFT | N/A || ALIGN=LEFT | 2 x Gigabit || 9
|-
| ALIGN=LEFT | Hcl07 (NIC2) || ALIGN=LEFT | <BR> || ALIGN=LEFT | 192.168.21.109 || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || 9
|-
| ALIGN=LEFT | [[Hcl08]] (NIC1) || ALIGN=LEFT | Dell Poweredge 750 || ALIGN=LEFT | 192.168.21.10 || ALIGN=LEFT | 3.4 Xeon || ALIGN=LEFT | 800MHz || ALIGN=LEFT | 1MB || ALIGN=LEFT | 256MB || ALIGN=LEFT | 80GB SATA || ALIGN=LEFT | N/A || ALIGN=LEFT | 2 x Gigabit || 10
|-
| ALIGN=LEFT | Hcl08 (NIC2) || ALIGN=LEFT | <BR> || ALIGN=LEFT | 192.168.21.110 || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || 10
|-
| ALIGN=LEFT | [[Hcl09]] (NIC1) || ALIGN=LEFT | IBM E-server 326 || ALIGN=LEFT | 192.168.21.11 || ALIGN=LEFT | 1.8 AMD Opteron || ALIGN=LEFT | 1GHz || ALIGN=LEFT | 1MB || ALIGN=LEFT | 1GB || ALIGN=LEFT | 80GB SATA || ALIGN=LEFT | N/A || ALIGN=LEFT | 2 x Gigabit || 11
|-
| ALIGN=LEFT | Hcl09 (NIC2) || ALIGN=LEFT | <BR> || ALIGN=LEFT | 192.168.21.111 || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || 11
|-
| ALIGN=LEFT | [[Hcl10]] (NIC1) || ALIGN=LEFT | IBM E-server 326 || ALIGN=LEFT | 192.168.21.12 || ALIGN=LEFT | 1.8 AMD Opteron || ALIGN=LEFT | 1GHz || ALIGN=LEFT | 1MB || ALIGN=LEFT | 1GB || ALIGN=LEFT | 80GB SATA || ALIGN=LEFT | N/A || ALIGN=LEFT | 2 x Gigabit || 12
|-
| ALIGN=LEFT | Hcl10 (NIC2) || ALIGN=LEFT | <BR> || ALIGN=LEFT | 192.168.21.112 || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || 12
|-
| ALIGN=LEFT | [[Hcl11]] (NIC1) || ALIGN=LEFT | IBM X-Series 306 || ALIGN=LEFT | 192.168.21.13 || ALIGN=LEFT | 3.2 P4 || ALIGN=LEFT | 800MHz || ALIGN=LEFT | 1MB || ALIGN=LEFT | 512MB || ALIGN=LEFT | 80GB SATA || ALIGN=LEFT | N/A || ALIGN=LEFT | 2 x Gigabit || 13
|-
| ALIGN=LEFT | Hcl11 (NIC2) || ALIGN=LEFT | <BR> || ALIGN=LEFT | 192.168.21.113 || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || 13
|-
| ALIGN=LEFT | [[Hcl12]] (NIC1) || ALIGN=LEFT | HP Proliant DL 320 G3 || ALIGN=LEFT | 192.168.21.14 || ALIGN=LEFT | 3.4 P4 || ALIGN=LEFT | 800MHz || ALIGN=LEFT | 1MB || ALIGN=LEFT | 512MB || ALIGN=LEFT | 80GB SATA || ALIGN=LEFT | N/A || ALIGN=LEFT | 2 x Gigabit || 14
|-
| ALIGN=LEFT | Hcl12 (NIC2) || ALIGN=LEFT | <BR> || ALIGN=LEFT | 192.168.21.114 || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || 14
|-
| ALIGN=LEFT | [[Hcl13]] (NIC1) || ALIGN=LEFT | HP Proliant DL 320 G3 || ALIGN=LEFT | 192.168.21.15 || ALIGN=LEFT | 2.9 Celeron || ALIGN=LEFT | 533MHz || ALIGN=LEFT | 256KB || ALIGN=LEFT | 1GB || ALIGN=LEFT | 80GB SATA || ALIGN=LEFT | N/A || ALIGN=LEFT | 2 x Gigabit || 15
|-
| ALIGN=LEFT | Hcl13 (NIC2) || ALIGN=LEFT | <BR> || ALIGN=LEFT | 192.168.21.115 || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || 15
|-
| ALIGN=LEFT | [[Hcl14]] (NIC1) || ALIGN=LEFT | HP Proliant DL 140 G2 || ALIGN=LEFT | 192.168.21.16 || ALIGN=LEFT | 3.4 Xeon || ALIGN=LEFT | 800MHz || ALIGN=LEFT | 1MB || ALIGN=LEFT | 1GB || ALIGN=LEFT | 80GB SATA || ALIGN=LEFT | N/A || ALIGN=LEFT | 2 x Gigabit || 16
|-
| ALIGN=LEFT | Hcl14 (NIC2) || ALIGN=LEFT | <BR> || ALIGN=LEFT | 192.168.21.116 || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || 16
|-
| ALIGN=LEFT | [[Hcl15]] (NIC1) || ALIGN=LEFT | HP Proliant DL 140 G2 || ALIGN=LEFT | 192.168.21.17 || ALIGN=LEFT | 2.8 Xeon || ALIGN=LEFT | 800MHz || ALIGN=LEFT | 1MB || ALIGN=LEFT | 1GB || ALIGN=LEFT | 80GB SATA || ALIGN=LEFT | N/A || ALIGN=LEFT | 2 x Gigabit || 17
|-
| ALIGN=LEFT | Hcl15 (NIC2) || ALIGN=LEFT | <BR> || ALIGN=LEFT | 192.168.21.117 || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || 17
|-
| ALIGN=LEFT | [[Hcl16]] (NIC1) || ALIGN=LEFT | HP Proliant DL 140 G2 || ALIGN=LEFT | 192.168.21.18 || ALIGN=LEFT | 3.6 Xeon || ALIGN=LEFT | 800MHz || ALIGN=LEFT | 2MB || ALIGN=LEFT | 1GB || ALIGN=LEFT | 80GB SATA || ALIGN=LEFT | N/A || ALIGN=LEFT | 2 x Gigabit || 18
|-
| ALIGN=LEFT | Hcl16 (NIC2) || ALIGN=LEFT | <BR> || ALIGN=LEFT | 192.168.21.118 || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || 18
|}

==Cluster Benchmarks==

===Stream===

[[Image:stream_bench_results.png]]

 hcl01.heterogeneous.ucd.ie
 -------------------------------------------------------------
 STREAM version $Revision: 5.9 $
 -------------------------------------------------------------
 This system uses 8 bytes per DOUBLE PRECISION word.
 -------------------------------------------------------------
 Array size = 2000000, Offset = 0
 Total memory required = 45.8 MB.
 Each test is run 10 times, but only the *best* time for each is used.
 -------------------------------------------------------------
 Number of Threads requested = 1
 -------------------------------------------------------------
 Printing one line per active thread....
 -------------------------------------------------------------
 Your clock granularity appears to be less than one microsecond.
 Each test below will take on the order of 8386 microseconds.
    (= 8386 clock ticks)
 Increase the size of the arrays if this shows that
 you are not getting at least 20 clock ticks per test.
 -------------------------------------------------------------
 WARNING -- The above is only a rough guideline.
 For best results, please be sure you know the
 precision of your system timer.
 -------------------------------------------------------------
 Function      Rate (MB/s)   Avg time     Min time     Max time
 Copy:        2615.4617       0.0125       0.0122       0.0132
 Scale:       2609.2783       0.0125       0.0123       0.0133
 Add:         3046.4707       0.0161       0.0158       0.0168
 Triad:       3064.7322       0.0160       0.0157       0.0166
 -------------------------------------------------------------
 Solution Validates
 -------------------------------------------------------------

 hcl02.heterogeneous.ucd.ie
 -------------------------------------------------------------
 STREAM version $Revision: 5.9 $
 -------------------------------------------------------------
 This system uses 8 bytes per DOUBLE PRECISION word.
 -------------------------------------------------------------
 Array size = 2000000, Offset = 0
 Total memory required = 45.8 MB.
 Each test is run 10 times, but only the *best* time for each is used.
 -------------------------------------------------------------
 Number of Threads requested = 1
 -------------------------------------------------------------
 Printing one line per active thread....
 -------------------------------------------------------------
 Your clock granularity appears to be less than one microsecond.
 Each test below will take on the order of 7790 microseconds.
    (= 7790 clock ticks)
 Increase the size of the arrays if this shows that
 you are not getting at least 20 clock ticks per test.
 -------------------------------------------------------------
 WARNING -- The above is only a rough guideline.
 For best results, please be sure you know the
 precision of your system timer.
 -------------------------------------------------------------
 Function      Rate (MB/s)   Avg time     Min time     Max time
 Copy:        2740.6840       0.0117       0.0117       0.0120
 Scale:       2745.3930       0.0117       0.0117       0.0117
 Add:         3063.1599       0.0157       0.0157       0.0157
 Triad:       3075.3572       0.0156       0.0156       0.0159
 -------------------------------------------------------------
 Solution Validates
 -------------------------------------------------------------

 hcl03.heterogeneous.ucd.ie
 -------------------------------------------------------------
 STREAM version $Revision: 5.9 $
 -------------------------------------------------------------
 This system uses 8 bytes per DOUBLE PRECISION word.
 -------------------------------------------------------------
 Array size = 2000000, Offset = 0
 Total memory required = 45.8 MB.
 Each test is run 10 times, but only the *best* time for each is used.
 -------------------------------------------------------------
 Number of Threads requested = 1
 -------------------------------------------------------------
 Printing one line per active thread....
 -------------------------------------------------------------
 Your clock granularity appears to be less than one microsecond.
 Each test below will take on the order of 8382 microseconds.
    (= 8382 clock ticks)
 Increase the size of the arrays if this shows that
 you are not getting at least 20 clock ticks per test.
 -------------------------------------------------------------
 WARNING -- The above is only a rough guideline.
 For best results, please be sure you know the
 precision of your system timer.
 -------------------------------------------------------------
 Function      Rate (MB/s)   Avg time     Min time     Max time
 Copy:        2788.6728       0.0115       0.0115       0.0115
 Scale:       2722.0144       0.0118       0.0118       0.0121
 Add:         3266.6166       0.0148       0.0147       0.0150
 Triad:       3287.4249       0.0146       0.0146       0.0147
 -------------------------------------------------------------
 Solution Validates
 -------------------------------------------------------------

 hcl04.heterogeneous.ucd.ie
 -------------------------------------------------------------
 STREAM version $Revision: 5.9 $
 -------------------------------------------------------------
 This system uses 8 bytes per DOUBLE PRECISION word.
 -------------------------------------------------------------
 Array size = 2000000, Offset = 0
 Total memory required = 45.8 MB.
 Each test is run 10 times, but only the *best* time for each is used.
 -------------------------------------------------------------
 Number of Threads requested = 1
 -------------------------------------------------------------
 Printing one line per active thread....
 -------------------------------------------------------------
 Your clock granularity appears to be less than one microsecond.
 Each test below will take on the order of 8378 microseconds.
    (= 8378 clock ticks)
 Increase the size of the arrays if this shows that
 you are not getting at least 20 clock ticks per test.
 -------------------------------------------------------------
 WARNING -- The above is only a rough guideline.
 For best results, please be sure you know the
 precision of your system timer.
 -------------------------------------------------------------
 Function      Rate (MB/s)   Avg time     Min time     Max time
 Copy:        2815.4071       0.0114       0.0114       0.0114
 Scale:       2751.5270       0.0116       0.0116       0.0117
 Add:         3260.1979       0.0148       0.0147       0.0150
 Triad:       3274.2183       0.0147       0.0147       0.0150
 -------------------------------------------------------------
 Solution Validates
 -------------------------------------------------------------

 hcl05.heterogeneous.ucd.ie
 -------------------------------------------------------------
 STREAM version $Revision: 5.9 $
 -------------------------------------------------------------
 This system uses 8 bytes per DOUBLE PRECISION word.
 -------------------------------------------------------------
 Array size = 2000000, Offset = 0
 Total memory required = 45.8 MB.
 Each test is run 10 times, but only the *best* time for each is used.
 -------------------------------------------------------------
 Number of Threads requested = 1
 -------------------------------------------------------------
 Printing one line per active thread....
 -------------------------------------------------------------
 Your clock granularity appears to be less than one microsecond.
 Each test below will take on the order of 14916 microseconds.
    (= 14916 clock ticks)
 Increase the size of the arrays if this shows that
 you are not getting at least 20 clock ticks per test.
 -------------------------------------------------------------
 WARNING -- The above is only a rough guideline.
 For best results, please be sure you know the
 precision of your system timer.
 -------------------------------------------------------------
 Function      Rate (MB/s)   Avg time     Min time     Max time
 Copy:        1581.8068       0.0203       0.0202       0.0204
 Scale:       1557.1081       0.0207       0.0206       0.0215
 Add:         1807.7015       0.0266       0.0266       0.0266
 Triad:       1832.9646       0.0263       0.0262       0.0265
 -------------------------------------------------------------
 Solution Validates
 -------------------------------------------------------------

 hcl06.heterogeneous.ucd.ie
 -------------------------------------------------------------
 STREAM version $Revision: 5.9 $
 -------------------------------------------------------------
 This system uses 8 bytes per DOUBLE PRECISION word.
 -------------------------------------------------------------
 Array size = 2000000, Offset = 0
 Total memory required = 45.8 MB.
 Each test is run 10 times, but only the *best* time for each is used.
 -------------------------------------------------------------
 Number of Threads requested = 1
 -------------------------------------------------------------
 Printing one line per active thread....
 -------------------------------------------------------------
 Your clock granularity appears to be less than one microsecond.
 Each test below will take on the order of 14824 microseconds.
    (= 14824 clock ticks)
 Increase the size of the arrays if this shows that
 you are not getting at least 20 clock ticks per test.
 -------------------------------------------------------------
 WARNING -- The above is only a rough guideline.
 For best results, please be sure you know the
 precision of your system timer.
 -------------------------------------------------------------
 Function      Rate (MB/s)   Avg time     Min time     Max time
 Copy:        1558.3835       0.0206       0.0205       0.0206
 Scale:       1550.3951       0.0207       0.0206       0.0209
 Add:         1863.4239       0.0258       0.0258       0.0259
 Triad:       1885.3916       0.0255       0.0255       0.0259
 -------------------------------------------------------------
 Solution Validates
 -------------------------------------------------------------

 hcl07.heterogeneous.ucd.ie
 -------------------------------------------------------------
 STREAM version $Revision: 5.9 $
 -------------------------------------------------------------
 This system uses 8 bytes per DOUBLE PRECISION word.
 -------------------------------------------------------------
 Array size = 2000000, Offset = 0
 Total memory required = 45.8 MB.
 Each test is run 10 times, but only the *best* time for each is used.
 -------------------------------------------------------------
 Number of Threads requested = 1
 -------------------------------------------------------------
 Printing one line per active thread....
 -------------------------------------------------------------
 Your clock granularity appears to be less than one microsecond.
 Each test below will take on the order of 13642 microseconds.
    (= 13642 clock ticks)
 Increase the size of the arrays if this shows that
 you are not getting at least 20 clock ticks per test.
 -------------------------------------------------------------
 WARNING -- The above is only a rough guideline.
 For best results, please be sure you know the
 precision of your system timer.
 -------------------------------------------------------------
 Function      Rate (MB/s)   Avg time     Min time     Max time
 Copy:        1759.4014       0.0183       0.0182       0.0186
 Scale:       1740.2731       0.0184       0.0184       0.0185
 Add:         2036.4962       0.0236       0.0236       0.0238
 Triad:       2045.5920       0.0235       0.0235       0.0235
 -------------------------------------------------------------
 Solution Validates
 -------------------------------------------------------------

 hcl08.heterogeneous.ucd.ie
 -------------------------------------------------------------
 STREAM version $Revision: 5.9 $
 -------------------------------------------------------------
 This system uses 8 bytes per DOUBLE PRECISION word.
 -------------------------------------------------------------
 Array size = 2000000, Offset = 0
 Total memory required = 45.8 MB.
 Each test is run 10 times, but only the *best* time for each is used.
 -------------------------------------------------------------
 Number of Threads requested = 1
 -------------------------------------------------------------
 Printing one line per active thread....
 -------------------------------------------------------------
 Your clock granularity appears to be less than one microsecond.
 Each test below will take on the order of 13476 microseconds.
    (= 13476 clock ticks)
 Increase the size of the arrays if this shows that
 you are not getting at least 20 clock ticks per test.
 -------------------------------------------------------------
 WARNING -- The above is only a rough guideline.
 For best results, please be sure you know the
 precision of your system timer.
 -------------------------------------------------------------
 Function      Rate (MB/s)   Avg time     Min time     Max time
 Copy:        1809.6454       0.0177       0.0177       0.0177
 Scale:       1784.8271       0.0179       0.0179       0.0179
 Add:         2085.3320       0.0231       0.0230       0.0232
 Triad:       2095.7094       0.0231       0.0229       0.0239
 -------------------------------------------------------------
 Solution Validates
 -------------------------------------------------------------

 hcl09.heterogeneous.ucd.ie
 -------------------------------------------------------------
 STREAM version $Revision: 5.9 $
 -------------------------------------------------------------
 This system uses 8 bytes per DOUBLE PRECISION word.
 -------------------------------------------------------------
 Array size = 2000000, Offset = 0
 Total memory required = 45.8 MB.
 Each test is run 10 times, but only the *best* time for each is used.
 -------------------------------------------------------------
 Number of Threads requested = 2
 -------------------------------------------------------------
 Printing one line per active thread....
 Printing one line per active thread....
 -------------------------------------------------------------
 Your clock granularity/precision appears to be 2 microseconds.
 Each test below will take on the order of 7890 microseconds.
    (= 3945 clock ticks)
 Increase the size of the arrays if this shows that
 you are not getting at least 20 clock ticks per test.
 -------------------------------------------------------------
 WARNING -- The above is only a rough guideline.
 For best results, please be sure you know the
 precision of your system timer.
 -------------------------------------------------------------
 Function      Rate (MB/s)   Avg time     Min time     Max time
 Copy:        2838.6204       0.0114       0.0113       0.0115
 Scale:       2774.6396       0.0116       0.0115       0.0117
 Add:         3144.0396       0.0155       0.0153       0.0156
 Triad:       3160.7905       0.0153       0.0152       0.0156
 -------------------------------------------------------------
 Solution Validates
 -------------------------------------------------------------

 hcl10.heterogeneous.ucd.ie
 -------------------------------------------------------------
 STREAM version $Revision: 5.9 $
 -------------------------------------------------------------
 This system uses 8 bytes per DOUBLE PRECISION word.
 -------------------------------------------------------------
 Array size = 2000000, Offset = 0
 Total memory required = 45.8 MB.
 Each test is run 10 times, but only the *best* time for each is used.
 -------------------------------------------------------------
 Number of Threads requested = 2
 -------------------------------------------------------------
 Printing one line per active thread....
 Printing one line per active thread....
 -------------------------------------------------------------
 Your clock granularity/precision appears to be 2 microseconds.
 Each test below will take on the order of 7959 microseconds.
    (= 3979 clock ticks)
 Increase the size of the arrays if this shows that
 you are not getting at least 20 clock ticks per test.
 -------------------------------------------------------------
 WARNING -- The above is only a rough guideline.
 For best results, please be sure you know the
 precision of your system timer.
 -------------------------------------------------------------
 Function      Rate (MB/s)   Avg time     Min time     Max time
 Copy:        2864.2732       0.0113       0.0112       0.0114
 Scale:       2784.2952       0.0116       0.0115       0.0118
 Add:         3162.2553       0.0153       0.0152       0.0155
 Triad:       3223.8430       0.0151       0.0149       0.0153
 -------------------------------------------------------------
 Solution Validates
 -------------------------------------------------------------

 hcl11.heterogeneous.ucd.ie
 -------------------------------------------------------------
 STREAM version $Revision: 5.9 $
 -------------------------------------------------------------
 This system uses 8 bytes per DOUBLE PRECISION word.
 -------------------------------------------------------------
 Array size = 2000000, Offset = 0
 Total memory required = 45.8 MB.
 Each test is run 10 times, but only the *best* time for each is used.
 -------------------------------------------------------------
 Number of Threads requested = 1
 -------------------------------------------------------------
 Printing one line per active thread....
 -------------------------------------------------------------
 Your clock granularity/precision appears to be 1 microseconds.
 Each test below will take on the order of 13082 microseconds.
    (= 13082 clock ticks)
 Increase the size of the arrays if this shows that
 you are not getting at least 20 clock ticks per test.
 -------------------------------------------------------------
 WARNING -- The above is only a rough guideline.
 For best results, please be sure you know the
 precision of your system timer.
------------------------------------------------------------- Function Rate (MB/s) Avg time Min time Max time Copy: 1842.8984 0.0174 0.0174 0.0174 Scale: 1818.1816 0.0177 0.0176 0.0179 Add: 2109.8918 0.0228 0.0227 0.0231 Triad: 2119.0161 0.0227 0.0227 0.0227 ------------------------------------------------------------- Solution Validates ------------------------------------------------------------- hcl12.heterogeneous.ucd.ie ------------------------------------------------------------- STREAM version $Revision: 5.9 $ ------------------------------------------------------------- This system uses 8 bytes per DOUBLE PRECISION word. ------------------------------------------------------------- Array size = 2000000, Offset = 0 Total memory required = 45.8 MB. Each test is run 10 times, but only the *best* time for each is used. ------------------------------------------------------------- Number of Threads requested = 1 ------------------------------------------------------------- Printing one line per active thread.... ------------------------------------------------------------- Your clock granularity appears to be less than one microsecond. Each test below will take on the order of 12621 microseconds. (= 12621 clock ticks) Increase the size of the arrays if this shows that you are not getting at least 20 clock ticks per test. ------------------------------------------------------------- WARNING -- The above is only a rough guideline. For best results, please be sure you know the precision of your system timer. 
------------------------------------------------------------- Function Rate (MB/s) Avg time Min time Max time Copy: 1805.0657 0.0178 0.0177 0.0179 Scale: 1795.5385 0.0179 0.0178 0.0182 Add: 2089.9636 0.0230 0.0230 0.0230 Triad: 2097.5313 0.0229 0.0229 0.0231 ------------------------------------------------------------- Solution Validates ------------------------------------------------------------- hcl13.heterogeneous.ucd.ie ------------------------------------------------------------- STREAM version $Revision: 5.9 $ ------------------------------------------------------------- This system uses 8 bytes per DOUBLE PRECISION word. ------------------------------------------------------------- Array size = 2000000, Offset = 0 Total memory required = 45.8 MB. Each test is run 10 times, but only the *best* time for each is used. ------------------------------------------------------------- Number of Threads requested = 1 ------------------------------------------------------------- Printing one line per active thread.... ------------------------------------------------------------- Your clock granularity appears to be less than one microsecond. Each test below will take on the order of 12203 microseconds. (= 12203 clock ticks) Increase the size of the arrays if this shows that you are not getting at least 20 clock ticks per test. ------------------------------------------------------------- WARNING -- The above is only a rough guideline. For best results, please be sure you know the precision of your system timer. 
------------------------------------------------------------- Function Rate (MB/s) Avg time Min time Max time Copy: 1768.9387 0.0181 0.0181 0.0182 Scale: 1765.9100 0.0182 0.0181 0.0184 Add: 2000.5875 0.0241 0.0240 0.0244 Triad: 2001.0897 0.0240 0.0240 0.0241 ------------------------------------------------------------- Solution Validates ------------------------------------------------------------- hcl14.heterogeneous.ucd.ie ------------------------------------------------------------- STREAM version $Revision: 5.9 $ ------------------------------------------------------------- This system uses 8 bytes per DOUBLE PRECISION word. ------------------------------------------------------------- Array size = 2000000, Offset = 0 Total memory required = 45.8 MB. Each test is run 10 times, but only the *best* time for each is used. ------------------------------------------------------------- Number of Threads requested = 1 ------------------------------------------------------------- Printing one line per active thread.... ------------------------------------------------------------- Your clock granularity appears to be less than one microsecond. Each test below will take on the order of 9576 microseconds. (= 9576 clock ticks) Increase the size of the arrays if this shows that you are not getting at least 20 clock ticks per test. ------------------------------------------------------------- WARNING -- The above is only a rough guideline. For best results, please be sure you know the precision of your system timer. 
------------------------------------------------------------- Function Rate (MB/s) Avg time Min time Max time Copy: 2254.3285 0.0142 0.0142 0.0142 Scale: 2291.6040 0.0140 0.0140 0.0140 Add: 2779.5354 0.0173 0.0173 0.0173 Triad: 2804.8951 0.0177 0.0171 0.0215 ------------------------------------------------------------- Solution Validates ------------------------------------------------------------- hcl15.heterogeneous.ucd.ie ------------------------------------------------------------- STREAM version $Revision: 5.9 $ ------------------------------------------------------------- This system uses 8 bytes per DOUBLE PRECISION word. ------------------------------------------------------------- Array size = 2000000, Offset = 0 Total memory required = 45.8 MB. Each test is run 10 times, but only the *best* time for each is used. ------------------------------------------------------------- Number of Threads requested = 1 ------------------------------------------------------------- Printing one line per active thread.... ------------------------------------------------------------- Your clock granularity appears to be less than one microsecond. Each test below will take on the order of 9788 microseconds. (= 9788 clock ticks) Increase the size of the arrays if this shows that you are not getting at least 20 clock ticks per test. ------------------------------------------------------------- WARNING -- The above is only a rough guideline. For best results, please be sure you know the precision of your system timer. 
------------------------------------------------------------- Function Rate (MB/s) Avg time Min time Max time Copy: 2223.0043 0.0146 0.0144 0.0161 Scale: 2264.6703 0.0142 0.0141 0.0142 Add: 2740.5171 0.0176 0.0175 0.0178 Triad: 2762.2584 0.0174 0.0174 0.0174 ------------------------------------------------------------- Solution Validates ------------------------------------------------------------- hcl16.heterogeneous.ucd.ie ------------------------------------------------------------- STREAM version $Revision: 5.9 $ ------------------------------------------------------------- This system uses 8 bytes per DOUBLE PRECISION word. ------------------------------------------------------------- Array size = 2000000, Offset = 0 Total memory required = 45.8 MB. Each test is run 10 times, but only the *best* time for each is used. ------------------------------------------------------------- Number of Threads requested = 1 ------------------------------------------------------------- Printing one line per active thread.... ------------------------------------------------------------- Your clock granularity appears to be less than one microsecond. Each test below will take on the order of 9590 microseconds. (= 9590 clock ticks) Increase the size of the arrays if this shows that you are not getting at least 20 clock ticks per test. ------------------------------------------------------------- WARNING -- The above is only a rough guideline. For best results, please be sure you know the precision of your system timer. 
 -------------------------------------------------------------
 Function    Rate (MB/s)  Avg time  Min time  Max time
 Copy:       2285.3882    0.0140    0.0140    0.0143
 Scale:      2294.5687    0.0140    0.0139    0.0140
 Add:        2800.9568    0.0172    0.0171    0.0172
 Triad:      2823.2048    0.0170    0.0170    0.0171
 -------------------------------------------------------------
 Solution Validates
 -------------------------------------------------------------

438f1a43feabdf07482cd0f30ea9021f60cffb66 459 454 2010-08-18T10:20:33Z Root 1 [[Cluster Specification]] moved to [[HCL Cluster Specifications]] wikitext text/x-wiki

{| border="1" cellspacing="1" cellpadding="5"
| Name || Make/Model || IP || Processor || Front Side Bus || L2 Cache || RAM || HDD 1 || HDD 2 || NIC || Rack
|-
| ALIGN=LEFT | hclswitch1 || ALIGN=LEFT | Cisco Catalyst 3560G || ALIGN=LEFT | 192.168.21.252 || ALIGN=LEFT | N/A || ALIGN=LEFT | N/A || ALIGN=LEFT | N/A || ALIGN=LEFT | N/A || ALIGN=LEFT | N/A || ALIGN=LEFT | N/A || ALIGN=LEFT | 24 x Gigabit || 42
|-
| ALIGN=LEFT | hclswitch2 || ALIGN=LEFT | Cisco Catalyst 3560G || ALIGN=LEFT | 192.168.21.253 || ALIGN=LEFT | N/A || ALIGN=LEFT | N/A || ALIGN=LEFT | N/A || ALIGN=LEFT | N/A || ALIGN=LEFT | N/A || ALIGN=LEFT | N/A || ALIGN=LEFT | 24 x Gigabit || 41
|-
| ALIGN=LEFT | N/A || ALIGN=LEFT | APC Smart UPS 1500 || ALIGN=LEFT | N/A || ALIGN=LEFT | N/A || ALIGN=LEFT | N/A || ALIGN=LEFT | N/A || ALIGN=LEFT | N/A || ALIGN=LEFT | N/A || ALIGN=LEFT | N/A || ALIGN=LEFT | N/A || 1 – 2
|-
| ALIGN=LEFT | [[Hcl01]] (NIC1) || ALIGN=LEFT | Dell Poweredge 750 || ALIGN=LEFT | 192.168.21.3 || ALIGN=LEFT | 3.4 Xeon || ALIGN=LEFT | 800MHz || ALIGN=LEFT | 1MB || ALIGN=LEFT | 1GB || ALIGN=LEFT | 80GB SATA || ALIGN=LEFT | 250GB SATA || ALIGN=LEFT | 2 x Gigabit || 3
|-
| ALIGN=LEFT | Hcl01 (NIC2) || ALIGN=LEFT | <BR> || ALIGN=LEFT | 192.168.21.103 || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || 3
|-
| ALIGN=LEFT | [[Hcl02]] (NIC1) || ALIGN=LEFT | Dell Poweredge 750 || ALIGN=LEFT | 192.168.21.4 || ALIGN=LEFT | 3.4 Xeon || ALIGN=LEFT | 800MHz || ALIGN=LEFT | 1MB || ALIGN=LEFT | 1GB || ALIGN=LEFT | 80GB SATA || ALIGN=LEFT | 250GB SATA || ALIGN=LEFT | 2 x Gigabit || 4
|-
| ALIGN=LEFT | Hcl02 (NIC2) || ALIGN=LEFT | <BR> || ALIGN=LEFT | 192.168.21.104 || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || 4
|-
| ALIGN=LEFT | [[Hcl03]] (NIC1) || ALIGN=LEFT | Dell Poweredge 750 || ALIGN=LEFT | 192.168.21.5 || ALIGN=LEFT | 3.4 Xeon || ALIGN=LEFT | 800MHz || ALIGN=LEFT | 1MB || ALIGN=LEFT | 1GB || ALIGN=LEFT | 80GB SATA || ALIGN=LEFT | N/A || ALIGN=LEFT | 2 x Gigabit || 5
|-
| ALIGN=LEFT | Hcl03 (NIC2) || ALIGN=LEFT | <BR> || ALIGN=LEFT | 192.168.21.105 || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || 5
|-
| ALIGN=LEFT | [[Hcl04]] (NIC1) || ALIGN=LEFT | Dell Poweredge 750 || ALIGN=LEFT | 192.168.21.6 || ALIGN=LEFT | 3.4 Xeon || ALIGN=LEFT | 800MHz || ALIGN=LEFT | 1MB || ALIGN=LEFT | 1GB || ALIGN=LEFT | 80GB SATA || ALIGN=LEFT | N/A || ALIGN=LEFT | 2 x Gigabit || 6
|-
| ALIGN=LEFT | Hcl04 (NIC2) || ALIGN=LEFT | <BR> || ALIGN=LEFT | 192.168.21.106 || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || 6
|-
| ALIGN=LEFT | [[Hcl05]] (NIC1) || ALIGN=LEFT | Dell Poweredge SC1425 || ALIGN=LEFT | 192.168.21.7 || ALIGN=LEFT | 3.6 Xeon || ALIGN=LEFT | 800MHz || ALIGN=LEFT | 2MB || ALIGN=LEFT | 256MB || ALIGN=LEFT | 80GB SATA || ALIGN=LEFT | N/A || ALIGN=LEFT | 2 x Gigabit || 7
|-
| ALIGN=LEFT | Hcl05 (NIC2) || ALIGN=LEFT | <BR> || ALIGN=LEFT | 192.168.21.107 || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || 7
|-
| ALIGN=LEFT | [[Hcl06]] (NIC1) || ALIGN=LEFT | Dell Poweredge SC1425 || ALIGN=LEFT | 192.168.21.8 || ALIGN=LEFT | 3.0 Xeon || ALIGN=LEFT | 800MHz || ALIGN=LEFT | 2MB || ALIGN=LEFT | 256MB || ALIGN=LEFT | 80GB SATA || ALIGN=LEFT | N/A || ALIGN=LEFT | 2 x Gigabit || 8
|-
| ALIGN=LEFT | Hcl06 (NIC2) || ALIGN=LEFT | <BR> || ALIGN=LEFT | 192.168.21.108 || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || 8
|-
| ALIGN=LEFT | [[Hcl07]] (NIC1) || ALIGN=LEFT | Dell Poweredge 750 || ALIGN=LEFT | 192.168.21.9 || ALIGN=LEFT | 3.4 Xeon || ALIGN=LEFT | 800MHz || ALIGN=LEFT | 1MB || ALIGN=LEFT | 256MB || ALIGN=LEFT | 80GB SATA || ALIGN=LEFT | N/A || ALIGN=LEFT | 2 x Gigabit || 9
|-
| ALIGN=LEFT | Hcl07 (NIC2) || ALIGN=LEFT | <BR> || ALIGN=LEFT | 192.168.21.109 || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || 9
|-
| ALIGN=LEFT | [[Hcl08]] (NIC1) || ALIGN=LEFT | Dell Poweredge 750 || ALIGN=LEFT | 192.168.21.10 || ALIGN=LEFT | 3.4 Xeon || ALIGN=LEFT | 800MHz || ALIGN=LEFT | 1MB || ALIGN=LEFT | 256MB || ALIGN=LEFT | 80GB SATA || ALIGN=LEFT | N/A || ALIGN=LEFT | 2 x Gigabit || 10
|-
| ALIGN=LEFT | Hcl08 (NIC2) || ALIGN=LEFT | <BR> || ALIGN=LEFT | 192.168.21.110 || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || 10
|-
| ALIGN=LEFT | [[Hcl09]] (NIC1) || ALIGN=LEFT | IBM E-server 326 || ALIGN=LEFT | 192.168.21.11 || ALIGN=LEFT | 1.8 AMD Opteron || ALIGN=LEFT | 1GHz || ALIGN=LEFT | 1MB || ALIGN=LEFT | 1GB || ALIGN=LEFT | 80GB SATA || ALIGN=LEFT | N/A || ALIGN=LEFT | 2 x Gigabit || 11
|-
| ALIGN=LEFT | Hcl09 (NIC2) || ALIGN=LEFT | <BR> || ALIGN=LEFT | 192.168.21.111 || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || 11
|-
| ALIGN=LEFT | [[Hcl10]] (NIC1) || ALIGN=LEFT | IBM E-server 326 || ALIGN=LEFT | 192.168.21.12 || ALIGN=LEFT | 1.8 AMD Opteron || ALIGN=LEFT | 1GHz || ALIGN=LEFT | 1MB || ALIGN=LEFT | 1GB || ALIGN=LEFT | 80GB SATA || ALIGN=LEFT | N/A || ALIGN=LEFT | 2 x Gigabit || 12
|-
| ALIGN=LEFT | Hcl10 (NIC2) || ALIGN=LEFT | <BR> || ALIGN=LEFT | 192.168.21.112 || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || 12
|-
| ALIGN=LEFT | [[Hcl11]] (NIC1) || ALIGN=LEFT | IBM X-Series 306 || ALIGN=LEFT | 192.168.21.13 || ALIGN=LEFT | 3.2 P4 || ALIGN=LEFT | 800MHz || ALIGN=LEFT | 1MB || ALIGN=LEFT | 512MB || ALIGN=LEFT | 80GB SATA || ALIGN=LEFT | N/A || ALIGN=LEFT | 2 x Gigabit || 13
|-
| ALIGN=LEFT | Hcl11 (NIC2) || ALIGN=LEFT | <BR> || ALIGN=LEFT | 192.168.21.113 || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || 13
|-
| ALIGN=LEFT | [[Hcl12]] (NIC1) || ALIGN=LEFT | HP Proliant DL 320 G3 || ALIGN=LEFT | 192.168.21.14 || ALIGN=LEFT | 3.4 P4 || ALIGN=LEFT | 800MHz || ALIGN=LEFT | 1MB || ALIGN=LEFT | 512MB || ALIGN=LEFT | 80GB SATA || ALIGN=LEFT | N/A || ALIGN=LEFT | 2 x Gigabit || 14
|-
| ALIGN=LEFT | Hcl12 (NIC2) || ALIGN=LEFT | <BR> || ALIGN=LEFT | 192.168.21.114 || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || 14
|-
| ALIGN=LEFT | [[Hcl13]] (NIC1) || ALIGN=LEFT | HP Proliant DL 320 G3 || ALIGN=LEFT | 192.168.21.15 || ALIGN=LEFT | 2.9 Celeron || ALIGN=LEFT | 533MHz || ALIGN=LEFT | 256KB || ALIGN=LEFT | 1GB || ALIGN=LEFT | 80GB SATA || ALIGN=LEFT | N/A || ALIGN=LEFT | 2 x Gigabit || 15
|-
| ALIGN=LEFT | Hcl13 (NIC2) || ALIGN=LEFT | <BR> || ALIGN=LEFT | 192.168.21.115 || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || 15
|-
| ALIGN=LEFT | [[Hcl14]] (NIC1) || ALIGN=LEFT | HP Proliant DL 140 G2 || ALIGN=LEFT | 192.168.21.16 || ALIGN=LEFT | 3.4 Xeon || ALIGN=LEFT | 800MHz || ALIGN=LEFT | 1MB || ALIGN=LEFT | 1GB || ALIGN=LEFT | 80GB SATA || ALIGN=LEFT | N/A || ALIGN=LEFT | 2 x Gigabit || 16
|-
| ALIGN=LEFT | Hcl14 (NIC2) || ALIGN=LEFT | <BR> || ALIGN=LEFT | 192.168.21.116 || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || 16
|-
| ALIGN=LEFT | [[Hcl15]] (NIC1) || ALIGN=LEFT | HP Proliant DL 140 G2 || ALIGN=LEFT | 192.168.21.17 || ALIGN=LEFT | 2.8 Xeon || ALIGN=LEFT | 800MHz || ALIGN=LEFT | 1MB || ALIGN=LEFT | 1GB || ALIGN=LEFT | 80GB SATA || ALIGN=LEFT | N/A || ALIGN=LEFT | 2 x Gigabit || 17
|-
| ALIGN=LEFT | Hcl15 (NIC2) || ALIGN=LEFT | <BR> || ALIGN=LEFT | 192.168.21.117 || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || 17
|-
| ALIGN=LEFT | [[Hcl16]] (NIC1) || ALIGN=LEFT | HP Proliant DL 140 G2 || ALIGN=LEFT | 192.168.21.18 || ALIGN=LEFT | 3.6 Xeon || ALIGN=LEFT | 800MHz || ALIGN=LEFT | 2MB || ALIGN=LEFT | 1GB || ALIGN=LEFT | 80GB SATA || ALIGN=LEFT | N/A || ALIGN=LEFT | 2 x Gigabit || 18
|-
| ALIGN=LEFT | Hcl16 (NIC2) || ALIGN=LEFT | <BR> || ALIGN=LEFT | 192.168.21.118 || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || ALIGN=LEFT | <BR> || 18
|}

==Cluster Benchmarks==

===Stream===

[[Image:stream_bench_results.png]]

 hcl01.heterogeneous.ucd.ie
 -------------------------------------------------------------
 STREAM version $Revision: 5.9 $
 -------------------------------------------------------------
 This system uses 8 bytes per DOUBLE PRECISION word.
 -------------------------------------------------------------
 Array size = 2000000, Offset = 0
 Total memory required = 45.8 MB.
 Each test is run 10 times, but only the *best* time for each is used.
 -------------------------------------------------------------
 Number of Threads requested = 1
 -------------------------------------------------------------
 Printing one line per active thread....
 -------------------------------------------------------------
 Your clock granularity appears to be less than one microsecond.
 Each test below will take on the order of 8386 microseconds.
    (= 8386 clock ticks)
 Increase the size of the arrays if this shows that
 you are not getting at least 20 clock ticks per test.
 -------------------------------------------------------------
 WARNING -- The above is only a rough guideline.
 For best results, please be sure you know the
 precision of your system timer.
 -------------------------------------------------------------
 Function    Rate (MB/s)  Avg time  Min time  Max time
 Copy:       2615.4617    0.0125    0.0122    0.0132
 Scale:      2609.2783    0.0125    0.0123    0.0133
 Add:        3046.4707    0.0161    0.0158    0.0168
 Triad:      3064.7322    0.0160    0.0157    0.0166
 -------------------------------------------------------------
 Solution Validates
 -------------------------------------------------------------
 hcl02.heterogeneous.ucd.ie
 -------------------------------------------------------------
 STREAM version $Revision: 5.9 $
 -------------------------------------------------------------
 This system uses 8 bytes per DOUBLE PRECISION word.
 -------------------------------------------------------------
 Array size = 2000000, Offset = 0
 Total memory required = 45.8 MB.
 Each test is run 10 times, but only the *best* time for each is used.
 -------------------------------------------------------------
 Number of Threads requested = 1
 -------------------------------------------------------------
 Printing one line per active thread....
 -------------------------------------------------------------
 Your clock granularity appears to be less than one microsecond.
 Each test below will take on the order of 7790 microseconds.
    (= 7790 clock ticks)
 Increase the size of the arrays if this shows that
 you are not getting at least 20 clock ticks per test.
 -------------------------------------------------------------
 WARNING -- The above is only a rough guideline.
 For best results, please be sure you know the
 precision of your system timer.
 -------------------------------------------------------------
 Function    Rate (MB/s)  Avg time  Min time  Max time
 Copy:       2740.6840    0.0117    0.0117    0.0120
 Scale:      2745.3930    0.0117    0.0117    0.0117
 Add:        3063.1599    0.0157    0.0157    0.0157
 Triad:      3075.3572    0.0156    0.0156    0.0159
 -------------------------------------------------------------
 Solution Validates
 -------------------------------------------------------------
 hcl03.heterogeneous.ucd.ie
 -------------------------------------------------------------
 STREAM version $Revision: 5.9 $
 -------------------------------------------------------------
 This system uses 8 bytes per DOUBLE PRECISION word.
 -------------------------------------------------------------
 Array size = 2000000, Offset = 0
 Total memory required = 45.8 MB.
 Each test is run 10 times, but only the *best* time for each is used.
 -------------------------------------------------------------
 Number of Threads requested = 1
 -------------------------------------------------------------
 Printing one line per active thread....
 -------------------------------------------------------------
 Your clock granularity appears to be less than one microsecond.
 Each test below will take on the order of 8382 microseconds.
    (= 8382 clock ticks)
 Increase the size of the arrays if this shows that
 you are not getting at least 20 clock ticks per test.
 -------------------------------------------------------------
 WARNING -- The above is only a rough guideline.
 For best results, please be sure you know the
 precision of your system timer.
 -------------------------------------------------------------
 Function    Rate (MB/s)  Avg time  Min time  Max time
 Copy:       2788.6728    0.0115    0.0115    0.0115
 Scale:      2722.0144    0.0118    0.0118    0.0121
 Add:        3266.6166    0.0148    0.0147    0.0150
 Triad:      3287.4249    0.0146    0.0146    0.0147
 -------------------------------------------------------------
 Solution Validates
 -------------------------------------------------------------
 hcl04.heterogeneous.ucd.ie
 -------------------------------------------------------------
 STREAM version $Revision: 5.9 $
 -------------------------------------------------------------
 This system uses 8 bytes per DOUBLE PRECISION word.
 -------------------------------------------------------------
 Array size = 2000000, Offset = 0
 Total memory required = 45.8 MB.
 Each test is run 10 times, but only the *best* time for each is used.
 -------------------------------------------------------------
 Number of Threads requested = 1
 -------------------------------------------------------------
 Printing one line per active thread....
 -------------------------------------------------------------
 Your clock granularity appears to be less than one microsecond.
 Each test below will take on the order of 8378 microseconds.
    (= 8378 clock ticks)
 Increase the size of the arrays if this shows that
 you are not getting at least 20 clock ticks per test.
 -------------------------------------------------------------
 WARNING -- The above is only a rough guideline.
 For best results, please be sure you know the
 precision of your system timer.
 -------------------------------------------------------------
 Function    Rate (MB/s)  Avg time  Min time  Max time
 Copy:       2815.4071    0.0114    0.0114    0.0114
 Scale:      2751.5270    0.0116    0.0116    0.0117
 Add:        3260.1979    0.0148    0.0147    0.0150
 Triad:      3274.2183    0.0147    0.0147    0.0150
 -------------------------------------------------------------
 Solution Validates
 -------------------------------------------------------------
 hcl05.heterogeneous.ucd.ie
 -------------------------------------------------------------
 STREAM version $Revision: 5.9 $
 -------------------------------------------------------------
 This system uses 8 bytes per DOUBLE PRECISION word.
 -------------------------------------------------------------
 Array size = 2000000, Offset = 0
 Total memory required = 45.8 MB.
 Each test is run 10 times, but only the *best* time for each is used.
 -------------------------------------------------------------
 Number of Threads requested = 1
 -------------------------------------------------------------
 Printing one line per active thread....
 -------------------------------------------------------------
 Your clock granularity appears to be less than one microsecond.
 Each test below will take on the order of 14916 microseconds.
    (= 14916 clock ticks)
 Increase the size of the arrays if this shows that
 you are not getting at least 20 clock ticks per test.
 -------------------------------------------------------------
 WARNING -- The above is only a rough guideline.
 For best results, please be sure you know the
 precision of your system timer.
 -------------------------------------------------------------
 Function    Rate (MB/s)  Avg time  Min time  Max time
 Copy:       1581.8068    0.0203    0.0202    0.0204
 Scale:      1557.1081    0.0207    0.0206    0.0215
 Add:        1807.7015    0.0266    0.0266    0.0266
 Triad:      1832.9646    0.0263    0.0262    0.0265
 -------------------------------------------------------------
 Solution Validates
 -------------------------------------------------------------
 hcl06.heterogeneous.ucd.ie
 -------------------------------------------------------------
 STREAM version $Revision: 5.9 $
 -------------------------------------------------------------
 This system uses 8 bytes per DOUBLE PRECISION word.
 -------------------------------------------------------------
 Array size = 2000000, Offset = 0
 Total memory required = 45.8 MB.
 Each test is run 10 times, but only the *best* time for each is used.
 -------------------------------------------------------------
 Number of Threads requested = 1
 -------------------------------------------------------------
 Printing one line per active thread....
 -------------------------------------------------------------
 Your clock granularity appears to be less than one microsecond.
 Each test below will take on the order of 14824 microseconds.
    (= 14824 clock ticks)
 Increase the size of the arrays if this shows that
 you are not getting at least 20 clock ticks per test.
 -------------------------------------------------------------
 WARNING -- The above is only a rough guideline.
 For best results, please be sure you know the
 precision of your system timer.
 -------------------------------------------------------------
 Function    Rate (MB/s)  Avg time  Min time  Max time
 Copy:       1558.3835    0.0206    0.0205    0.0206
 Scale:      1550.3951    0.0207    0.0206    0.0209
 Add:        1863.4239    0.0258    0.0258    0.0259
 Triad:      1885.3916    0.0255    0.0255    0.0259
 -------------------------------------------------------------
 Solution Validates
 -------------------------------------------------------------
 hcl07.heterogeneous.ucd.ie
 -------------------------------------------------------------
 STREAM version $Revision: 5.9 $
 -------------------------------------------------------------
 This system uses 8 bytes per DOUBLE PRECISION word.
 -------------------------------------------------------------
 Array size = 2000000, Offset = 0
 Total memory required = 45.8 MB.
 Each test is run 10 times, but only the *best* time for each is used.
 -------------------------------------------------------------
 Number of Threads requested = 1
 -------------------------------------------------------------
 Printing one line per active thread....
 -------------------------------------------------------------
 Your clock granularity appears to be less than one microsecond.
 Each test below will take on the order of 13642 microseconds.
    (= 13642 clock ticks)
 Increase the size of the arrays if this shows that
 you are not getting at least 20 clock ticks per test.
 -------------------------------------------------------------
 WARNING -- The above is only a rough guideline.
 For best results, please be sure you know the
 precision of your system timer.
 -------------------------------------------------------------
 Function    Rate (MB/s)  Avg time  Min time  Max time
 Copy:       1759.4014    0.0183    0.0182    0.0186
 Scale:      1740.2731    0.0184    0.0184    0.0185
 Add:        2036.4962    0.0236    0.0236    0.0238
 Triad:      2045.5920    0.0235    0.0235    0.0235
 -------------------------------------------------------------
 Solution Validates
 -------------------------------------------------------------
 hcl08.heterogeneous.ucd.ie
 -------------------------------------------------------------
 STREAM version $Revision: 5.9 $
 -------------------------------------------------------------
 This system uses 8 bytes per DOUBLE PRECISION word.
 -------------------------------------------------------------
 Array size = 2000000, Offset = 0
 Total memory required = 45.8 MB.
 Each test is run 10 times, but only the *best* time for each is used.
 -------------------------------------------------------------
 Number of Threads requested = 1
 -------------------------------------------------------------
 Printing one line per active thread....
 -------------------------------------------------------------
 Your clock granularity appears to be less than one microsecond.
 Each test below will take on the order of 13476 microseconds.
    (= 13476 clock ticks)
 Increase the size of the arrays if this shows that
 you are not getting at least 20 clock ticks per test.
 -------------------------------------------------------------
 WARNING -- The above is only a rough guideline.
 For best results, please be sure you know the
 precision of your system timer.
 -------------------------------------------------------------
 Function    Rate (MB/s)  Avg time  Min time  Max time
 Copy:       1809.6454    0.0177    0.0177    0.0177
 Scale:      1784.8271    0.0179    0.0179    0.0179
 Add:        2085.3320    0.0231    0.0230    0.0232
 Triad:      2095.7094    0.0231    0.0229    0.0239
 -------------------------------------------------------------
 Solution Validates
 -------------------------------------------------------------
 hcl09.heterogeneous.ucd.ie
 -------------------------------------------------------------
 STREAM version $Revision: 5.9 $
 -------------------------------------------------------------
 This system uses 8 bytes per DOUBLE PRECISION word.
 -------------------------------------------------------------
 Array size = 2000000, Offset = 0
 Total memory required = 45.8 MB.
 Each test is run 10 times, but only the *best* time for each is used.
 -------------------------------------------------------------
 Number of Threads requested = 2
 -------------------------------------------------------------
 Printing one line per active thread....
 Printing one line per active thread....
 -------------------------------------------------------------
 Your clock granularity/precision appears to be 2 microseconds.
 Each test below will take on the order of 7890 microseconds.
    (= 3945 clock ticks)
 Increase the size of the arrays if this shows that
 you are not getting at least 20 clock ticks per test.
 -------------------------------------------------------------
 WARNING -- The above is only a rough guideline.
 For best results, please be sure you know the
 precision of your system timer.
 -------------------------------------------------------------
 Function    Rate (MB/s)  Avg time  Min time  Max time
 Copy:       2838.6204    0.0114    0.0113    0.0115
 Scale:      2774.6396    0.0116    0.0115    0.0117
 Add:        3144.0396    0.0155    0.0153    0.0156
 Triad:      3160.7905    0.0153    0.0152    0.0156
 -------------------------------------------------------------
 Solution Validates
 -------------------------------------------------------------
 hcl10.heterogeneous.ucd.ie
 -------------------------------------------------------------
 STREAM version $Revision: 5.9 $
 -------------------------------------------------------------
 This system uses 8 bytes per DOUBLE PRECISION word.
 -------------------------------------------------------------
 Array size = 2000000, Offset = 0
 Total memory required = 45.8 MB.
 Each test is run 10 times, but only the *best* time for each is used.
 -------------------------------------------------------------
 Number of Threads requested = 2
 -------------------------------------------------------------
 Printing one line per active thread....
 Printing one line per active thread....
 -------------------------------------------------------------
 Your clock granularity/precision appears to be 2 microseconds.
 Each test below will take on the order of 7959 microseconds.
    (= 3979 clock ticks)
 Increase the size of the arrays if this shows that
 you are not getting at least 20 clock ticks per test.
 -------------------------------------------------------------
 WARNING -- The above is only a rough guideline.
 For best results, please be sure you know the
 precision of your system timer.
 -------------------------------------------------------------
 Function    Rate (MB/s)  Avg time  Min time  Max time
 Copy:       2864.2732    0.0113    0.0112    0.0114
 Scale:      2784.2952    0.0116    0.0115    0.0118
 Add:        3162.2553    0.0153    0.0152    0.0155
 Triad:      3223.8430    0.0151    0.0149    0.0153
 -------------------------------------------------------------
 Solution Validates
 -------------------------------------------------------------
 hcl11.heterogeneous.ucd.ie
 -------------------------------------------------------------
 STREAM version $Revision: 5.9 $
 -------------------------------------------------------------
 This system uses 8 bytes per DOUBLE PRECISION word.
 -------------------------------------------------------------
 Array size = 2000000, Offset = 0
 Total memory required = 45.8 MB.
 Each test is run 10 times, but only the *best* time for each is used.
 -------------------------------------------------------------
 Number of Threads requested = 1
 -------------------------------------------------------------
 Printing one line per active thread....
 -------------------------------------------------------------
 Your clock granularity/precision appears to be 1 microseconds.
 Each test below will take on the order of 13082 microseconds.
    (= 13082 clock ticks)
 Increase the size of the arrays if this shows that
 you are not getting at least 20 clock ticks per test.
 -------------------------------------------------------------
 WARNING -- The above is only a rough guideline.
 For best results, please be sure you know the
 precision of your system timer.
------------------------------------------------------------- Function Rate (MB/s) Avg time Min time Max time Copy: 1842.8984 0.0174 0.0174 0.0174 Scale: 1818.1816 0.0177 0.0176 0.0179 Add: 2109.8918 0.0228 0.0227 0.0231 Triad: 2119.0161 0.0227 0.0227 0.0227 ------------------------------------------------------------- Solution Validates ------------------------------------------------------------- hcl12.heterogeneous.ucd.ie ------------------------------------------------------------- STREAM version $Revision: 5.9 $ ------------------------------------------------------------- This system uses 8 bytes per DOUBLE PRECISION word. ------------------------------------------------------------- Array size = 2000000, Offset = 0 Total memory required = 45.8 MB. Each test is run 10 times, but only the *best* time for each is used. ------------------------------------------------------------- Number of Threads requested = 1 ------------------------------------------------------------- Printing one line per active thread.... ------------------------------------------------------------- Your clock granularity appears to be less than one microsecond. Each test below will take on the order of 12621 microseconds. (= 12621 clock ticks) Increase the size of the arrays if this shows that you are not getting at least 20 clock ticks per test. ------------------------------------------------------------- WARNING -- The above is only a rough guideline. For best results, please be sure you know the precision of your system timer. 
------------------------------------------------------------- Function Rate (MB/s) Avg time Min time Max time Copy: 1805.0657 0.0178 0.0177 0.0179 Scale: 1795.5385 0.0179 0.0178 0.0182 Add: 2089.9636 0.0230 0.0230 0.0230 Triad: 2097.5313 0.0229 0.0229 0.0231 ------------------------------------------------------------- Solution Validates ------------------------------------------------------------- hcl13.heterogeneous.ucd.ie ------------------------------------------------------------- STREAM version $Revision: 5.9 $ ------------------------------------------------------------- This system uses 8 bytes per DOUBLE PRECISION word. ------------------------------------------------------------- Array size = 2000000, Offset = 0 Total memory required = 45.8 MB. Each test is run 10 times, but only the *best* time for each is used. ------------------------------------------------------------- Number of Threads requested = 1 ------------------------------------------------------------- Printing one line per active thread.... ------------------------------------------------------------- Your clock granularity appears to be less than one microsecond. Each test below will take on the order of 12203 microseconds. (= 12203 clock ticks) Increase the size of the arrays if this shows that you are not getting at least 20 clock ticks per test. ------------------------------------------------------------- WARNING -- The above is only a rough guideline. For best results, please be sure you know the precision of your system timer. 
------------------------------------------------------------- Function Rate (MB/s) Avg time Min time Max time Copy: 1768.9387 0.0181 0.0181 0.0182 Scale: 1765.9100 0.0182 0.0181 0.0184 Add: 2000.5875 0.0241 0.0240 0.0244 Triad: 2001.0897 0.0240 0.0240 0.0241 ------------------------------------------------------------- Solution Validates ------------------------------------------------------------- hcl14.heterogeneous.ucd.ie ------------------------------------------------------------- STREAM version $Revision: 5.9 $ ------------------------------------------------------------- This system uses 8 bytes per DOUBLE PRECISION word. ------------------------------------------------------------- Array size = 2000000, Offset = 0 Total memory required = 45.8 MB. Each test is run 10 times, but only the *best* time for each is used. ------------------------------------------------------------- Number of Threads requested = 1 ------------------------------------------------------------- Printing one line per active thread.... ------------------------------------------------------------- Your clock granularity appears to be less than one microsecond. Each test below will take on the order of 9576 microseconds. (= 9576 clock ticks) Increase the size of the arrays if this shows that you are not getting at least 20 clock ticks per test. ------------------------------------------------------------- WARNING -- The above is only a rough guideline. For best results, please be sure you know the precision of your system timer. 
------------------------------------------------------------- Function Rate (MB/s) Avg time Min time Max time Copy: 2254.3285 0.0142 0.0142 0.0142 Scale: 2291.6040 0.0140 0.0140 0.0140 Add: 2779.5354 0.0173 0.0173 0.0173 Triad: 2804.8951 0.0177 0.0171 0.0215 ------------------------------------------------------------- Solution Validates ------------------------------------------------------------- hcl15.heterogeneous.ucd.ie ------------------------------------------------------------- STREAM version $Revision: 5.9 $ ------------------------------------------------------------- This system uses 8 bytes per DOUBLE PRECISION word. ------------------------------------------------------------- Array size = 2000000, Offset = 0 Total memory required = 45.8 MB. Each test is run 10 times, but only the *best* time for each is used. ------------------------------------------------------------- Number of Threads requested = 1 ------------------------------------------------------------- Printing one line per active thread.... ------------------------------------------------------------- Your clock granularity appears to be less than one microsecond. Each test below will take on the order of 9788 microseconds. (= 9788 clock ticks) Increase the size of the arrays if this shows that you are not getting at least 20 clock ticks per test. ------------------------------------------------------------- WARNING -- The above is only a rough guideline. For best results, please be sure you know the precision of your system timer. 
------------------------------------------------------------- Function Rate (MB/s) Avg time Min time Max time Copy: 2223.0043 0.0146 0.0144 0.0161 Scale: 2264.6703 0.0142 0.0141 0.0142 Add: 2740.5171 0.0176 0.0175 0.0178 Triad: 2762.2584 0.0174 0.0174 0.0174 ------------------------------------------------------------- Solution Validates ------------------------------------------------------------- hcl16.heterogeneous.ucd.ie ------------------------------------------------------------- STREAM version $Revision: 5.9 $ ------------------------------------------------------------- This system uses 8 bytes per DOUBLE PRECISION word. ------------------------------------------------------------- Array size = 2000000, Offset = 0 Total memory required = 45.8 MB. Each test is run 10 times, but only the *best* time for each is used. ------------------------------------------------------------- Number of Threads requested = 1 ------------------------------------------------------------- Printing one line per active thread.... ------------------------------------------------------------- Your clock granularity appears to be less than one microsecond. Each test below will take on the order of 9590 microseconds. (= 9590 clock ticks) Increase the size of the arrays if this shows that you are not getting at least 20 clock ticks per test. ------------------------------------------------------------- WARNING -- The above is only a rough guideline. For best results, please be sure you know the precision of your system timer. 
 -------------------------------------------------------------
 Function      Rate (MB/s)   Avg time     Min time     Max time
 Copy:        2285.3882       0.0140       0.0140       0.0143
 Scale:       2294.5687       0.0140       0.0139       0.0140
 Add:         2800.9568       0.0172       0.0171       0.0172
 Triad:       2823.2048       0.0170       0.0170       0.0171
 -------------------------------------------------------------
 Solution Validates
 -------------------------------------------------------------

438f1a43feabdf07482cd0f30ea9021f60cffb66 470 459 2010-08-22T11:29:02Z Rhiggins 4 wikitext text/x-wiki

{| border="1" cellspacing="1" cellpadding="5"
! Name !! Make/Model !! IP !! Processor !! Front Side Bus !! L2 Cache !! RAM !! HDD 1 !! HDD 2 !! NIC !! Rack
|-
| hclswitch1 || Cisco Catalyst 3560G || 192.168.21.252 || N/A || N/A || N/A || N/A || N/A || N/A || 24 x Gigabit || 42
|-
| hclswitch2 || Cisco Catalyst 3560G || 192.168.21.253 || N/A || N/A || N/A || N/A || N/A || N/A || 24 x Gigabit || 41
|-
| N/A || APC Smart UPS 1500 || N/A || N/A || N/A || N/A || N/A || N/A || N/A || N/A || 1 – 2
|-
| [[Hcl01]] (NIC1) || Dell Poweredge 750 || 192.168.21.3 || 3.4 Xeon || 800MHz || 1MB || 1GB || 80GB SATA || 250GB SATA || 2 x Gigabit || 3
|-
| Hcl01 (NIC2) || <BR> || 192.168.21.103 || <BR> || <BR> || <BR> || <BR> || <BR> || <BR> || <BR> || 3
|-
| [[Hcl02]] (NIC1) || Dell Poweredge 750 || 192.168.21.4 || 3.4 Xeon || 800MHz || 1MB || 1GB || 80GB SATA || 250GB SATA || 2 x Gigabit || 4
|-
| Hcl02 (NIC2) || <BR> || 192.168.21.104 || <BR> || <BR> || <BR> || <BR> || <BR> || <BR> || <BR> || 4
|-
| [[Hcl03]] (NIC1) || Dell Poweredge 750 || 192.168.21.5 || 3.4 Xeon || 800MHz || 1MB || 1GB || 80GB SATA || N/A || 2 x Gigabit || 5
|-
| Hcl03 (NIC2) || <BR> || 192.168.21.105 || <BR> || <BR> || <BR> || <BR> || <BR> || <BR> || <BR> || 5
|-
| [[Hcl04]] (NIC1) || Dell Poweredge 750 || 192.168.21.6 || 3.4 Xeon || 800MHz || 1MB || 1GB || 80GB SATA || N/A || 2 x Gigabit || 6
|-
| Hcl04 (NIC2) || <BR> || 192.168.21.106 || <BR> || <BR> || <BR> || <BR> || <BR> || <BR> || <BR> || 6
|-
| [[Hcl05]] (NIC1) || Dell Poweredge SC1425 || 192.168.21.7 || 3.6 Xeon || 800MHz || 2MB || 256MB || 80GB SATA || N/A || 2 x Gigabit || 7
|-
| Hcl05 (NIC2) || <BR> || 192.168.21.107 || <BR> || <BR> || <BR> || <BR> || <BR> || <BR> || <BR> || 7
|-
| [[Hcl06]] (NIC1) || Dell Poweredge SC1425 || 192.168.21.8 || 3.0 Xeon || 800MHz || 2MB || 256MB || 80GB SATA || N/A || 2 x Gigabit || 8
|-
| Hcl06 (NIC2) || <BR> || 192.168.21.108 || <BR> || <BR> || <BR> || <BR> || <BR> || <BR> || <BR> || 8
|-
| [[Hcl07]] (NIC1) || Dell Poweredge 750 || 192.168.21.9 || 3.4 Xeon || 800MHz || 1MB || 256MB || 80GB SATA || N/A || 2 x Gigabit || 9
|-
| Hcl07 (NIC2) || <BR> || 192.168.21.109 || <BR> || <BR> || <BR> || <BR> || <BR> || <BR> || <BR> || 9
|-
| [[Hcl08]] (NIC1) || Dell Poweredge 750 || 192.168.21.10 || 3.4 Xeon || 800MHz || 1MB || 256MB || 80GB SATA || N/A || 2 x Gigabit || 10
|-
| Hcl08 (NIC2) || <BR> || 192.168.21.110 || <BR> || <BR> || <BR> || <BR> || <BR> || <BR> || <BR> || 10
|-
| [[Hcl09]] (NIC1) || IBM E-server 326 || 192.168.21.11 || 1.8 AMD Opteron || 1GHz || 1MB || 1GB || 80GB SATA || N/A || 2 x Gigabit || 11
|-
| Hcl09 (NIC2) || <BR> || 192.168.21.111 || <BR> || <BR> || <BR> || <BR> || <BR> || <BR> || <BR> || 11
|-
| [[Hcl10]] (NIC1) || IBM E-server 326 || 192.168.21.12 || 1.8 AMD Opteron || 1GHz || 1MB || 1GB || 80GB SATA || N/A || 2 x Gigabit || 12
|-
| Hcl10 (NIC2) || <BR> || 192.168.21.112 || <BR> || <BR> || <BR> || <BR> || <BR> || <BR> || <BR> || 12
|-
| [[Hcl11]] (NIC1) || IBM X-Series 306 || 192.168.21.13 || 3.2 P4 || 800MHz || 1MB || 512MB || 80GB SATA || N/A || 2 x Gigabit || 13
|-
| Hcl11 (NIC2) || <BR> || 192.168.21.113 || <BR> || <BR> || <BR> || <BR> || <BR> || <BR> || <BR> || 13
|-
| [[Hcl12]] (NIC1) || HP Proliant DL 320 G3 || 192.168.21.14 || 3.4 P4 || 800MHz || 1MB || 512MB || 80GB SATA || N/A || 2 x Gigabit || 14
|-
| Hcl12 (NIC2) || <BR> || 192.168.21.114 || <BR> || <BR> || <BR> || <BR> || <BR> || <BR> || <BR> || 14
|-
| [[Hcl13]] (NIC1) || HP Proliant DL 320 G3 || 192.168.21.15 || 2.9 Celeron || 533MHz || 256KB || 1GB || 80GB SATA || N/A || 2 x Gigabit || 15
|-
| Hcl13 (NIC2) || <BR> || 192.168.21.115 || <BR> || <BR> || <BR> || <BR> || <BR> || <BR> || <BR> || 15
|-
| [[Hcl14]] (NIC1) || HP Proliant DL 140 G2 || 192.168.21.16 || 3.4 Xeon || 800MHz || 1MB || 1GB || 80GB SATA || N/A || 2 x Gigabit || 16
|-
| Hcl14 (NIC2) || <BR> || 192.168.21.116 || <BR> || <BR> || <BR> || <BR> || <BR> || <BR> || <BR> || 16
|-
| [[Hcl15]] (NIC1) || HP Proliant DL 140 G2 || 192.168.21.17 || 2.8 Xeon || 800MHz || 1MB || 1GB || 80GB SATA || N/A || 2 x Gigabit || 17
|-
| Hcl15 (NIC2) || <BR> || 192.168.21.117 || <BR> || <BR> || <BR> || <BR> || <BR> || <BR> || <BR> || 17
|-
| [[Hcl16]] (NIC1) || HP Proliant DL 140 G2 || 192.168.21.18 || 3.6 Xeon || 800MHz || 2MB || 1GB || 80GB SATA || N/A || 2 x Gigabit || 18
|-
| Hcl16 (NIC2) || <BR> || 192.168.21.118 || <BR> || <BR> || <BR> || <BR> || <BR> || <BR> || <BR> || 18
|}

==Cluster Benchmarks==

===Stream===

===Cluster Performance===

 ----------------------------------------------
 Double precision appears to have 16 digits of accuracy
 Assuming 8 bytes per DOUBLE PRECISION word
 ----------------------------------------------
 Number of processors = 18
 Array size = 2000000
 Offset = 0
 The total memory requirement is 824.0 MB ( 45.8MB/task)
 You are running each test 10 times
 --
 The *best* time for each test is used
 *EXCLUDING* the first and last iterations
 ----------------------------------------------------
 Your clock granularity appears to be less than one microsecond
 Your clock granularity/precision appears to be 1 microseconds
 ----------------------------------------------------
 Function      Rate (MB/s)   Avg time     Min time     Max time
 Copy:       24589.1750       0.0235       0.0234       0.0237
 Scale:      24493.9786       0.0237       0.0235       0.0245
 Add:        27594.1797       0.0314       0.0313       0.0315
 Triad:      27695.7938       0.0313       0.0312       0.0315
 -----------------------------------------------
 Solution Validates!
 -----------------------------------------------

===Individual Node Performance===

[[Image:stream_bench_results.png]]

 hcl01.heterogeneous.ucd.ie
 -------------------------------------------------------------
 STREAM version $Revision: 5.9 $
 -------------------------------------------------------------
 This system uses 8 bytes per DOUBLE PRECISION word.
 -------------------------------------------------------------
 Array size = 2000000, Offset = 0
 Total memory required = 45.8 MB.
 Each test is run 10 times, but only the *best* time for each is used.
 -------------------------------------------------------------
 Number of Threads requested = 1
 -------------------------------------------------------------
 Printing one line per active thread....
 -------------------------------------------------------------
 Your clock granularity appears to be less than one microsecond.
 Each test below will take on the order of 8386 microseconds. (= 8386 clock ticks)
 Increase the size of the arrays if this shows that you are not getting at least 20 clock ticks per test.
 -------------------------------------------------------------
 WARNING -- The above is only a rough guideline.
 For best results, please be sure you know the precision of your system timer.
 -------------------------------------------------------------
 Function      Rate (MB/s)   Avg time     Min time     Max time
 Copy:        2615.4617       0.0125       0.0122       0.0132
 Scale:       2609.2783       0.0125       0.0123       0.0133
 Add:         3046.4707       0.0161       0.0158       0.0168
 Triad:       3064.7322       0.0160       0.0157       0.0166
 -------------------------------------------------------------
 Solution Validates
 -------------------------------------------------------------
 
 hcl02.heterogeneous.ucd.ie
 -------------------------------------------------------------
 STREAM version $Revision: 5.9 $
 -------------------------------------------------------------
 This system uses 8 bytes per DOUBLE PRECISION word.
 -------------------------------------------------------------
 Array size = 2000000, Offset = 0
 Total memory required = 45.8 MB.
 Each test is run 10 times, but only the *best* time for each is used.
 -------------------------------------------------------------
 Number of Threads requested = 1
 -------------------------------------------------------------
 Printing one line per active thread....
 -------------------------------------------------------------
 Your clock granularity appears to be less than one microsecond.
 Each test below will take on the order of 7790 microseconds. (= 7790 clock ticks)
 Increase the size of the arrays if this shows that you are not getting at least 20 clock ticks per test.
 -------------------------------------------------------------
 WARNING -- The above is only a rough guideline.
 For best results, please be sure you know the precision of your system timer.
 -------------------------------------------------------------
 Function      Rate (MB/s)   Avg time     Min time     Max time
 Copy:        2740.6840       0.0117       0.0117       0.0120
 Scale:       2745.3930       0.0117       0.0117       0.0117
 Add:         3063.1599       0.0157       0.0157       0.0157
 Triad:       3075.3572       0.0156       0.0156       0.0159
 -------------------------------------------------------------
 Solution Validates
 -------------------------------------------------------------
 
 hcl03.heterogeneous.ucd.ie
 -------------------------------------------------------------
 STREAM version $Revision: 5.9 $
 -------------------------------------------------------------
 This system uses 8 bytes per DOUBLE PRECISION word.
 -------------------------------------------------------------
 Array size = 2000000, Offset = 0
 Total memory required = 45.8 MB.
 Each test is run 10 times, but only the *best* time for each is used.
 -------------------------------------------------------------
 Number of Threads requested = 1
 -------------------------------------------------------------
 Printing one line per active thread....
 -------------------------------------------------------------
 Your clock granularity appears to be less than one microsecond.
 Each test below will take on the order of 8382 microseconds. (= 8382 clock ticks)
 Increase the size of the arrays if this shows that you are not getting at least 20 clock ticks per test.
 -------------------------------------------------------------
 WARNING -- The above is only a rough guideline.
 For best results, please be sure you know the precision of your system timer.
 -------------------------------------------------------------
 Function      Rate (MB/s)   Avg time     Min time     Max time
 Copy:        2788.6728       0.0115       0.0115       0.0115
 Scale:       2722.0144       0.0118       0.0118       0.0121
 Add:         3266.6166       0.0148       0.0147       0.0150
 Triad:       3287.4249       0.0146       0.0146       0.0147
 -------------------------------------------------------------
 Solution Validates
 -------------------------------------------------------------
 
 hcl04.heterogeneous.ucd.ie
 -------------------------------------------------------------
 STREAM version $Revision: 5.9 $
 -------------------------------------------------------------
 This system uses 8 bytes per DOUBLE PRECISION word.
 -------------------------------------------------------------
 Array size = 2000000, Offset = 0
 Total memory required = 45.8 MB.
 Each test is run 10 times, but only the *best* time for each is used.
 -------------------------------------------------------------
 Number of Threads requested = 1
 -------------------------------------------------------------
 Printing one line per active thread....
 -------------------------------------------------------------
 Your clock granularity appears to be less than one microsecond.
 Each test below will take on the order of 8378 microseconds. (= 8378 clock ticks)
 Increase the size of the arrays if this shows that you are not getting at least 20 clock ticks per test.
 -------------------------------------------------------------
 WARNING -- The above is only a rough guideline.
 For best results, please be sure you know the precision of your system timer.
 -------------------------------------------------------------
 Function      Rate (MB/s)   Avg time     Min time     Max time
 Copy:        2815.4071       0.0114       0.0114       0.0114
 Scale:       2751.5270       0.0116       0.0116       0.0117
 Add:         3260.1979       0.0148       0.0147       0.0150
 Triad:       3274.2183       0.0147       0.0147       0.0150
 -------------------------------------------------------------
 Solution Validates
 -------------------------------------------------------------
 
 hcl05.heterogeneous.ucd.ie
 -------------------------------------------------------------
 STREAM version $Revision: 5.9 $
 -------------------------------------------------------------
 This system uses 8 bytes per DOUBLE PRECISION word.
 -------------------------------------------------------------
 Array size = 2000000, Offset = 0
 Total memory required = 45.8 MB.
 Each test is run 10 times, but only the *best* time for each is used.
 -------------------------------------------------------------
 Number of Threads requested = 1
 -------------------------------------------------------------
 Printing one line per active thread....
 -------------------------------------------------------------
 Your clock granularity appears to be less than one microsecond.
 Each test below will take on the order of 14916 microseconds. (= 14916 clock ticks)
 Increase the size of the arrays if this shows that you are not getting at least 20 clock ticks per test.
 -------------------------------------------------------------
 WARNING -- The above is only a rough guideline.
 For best results, please be sure you know the precision of your system timer.
 -------------------------------------------------------------
 Function      Rate (MB/s)   Avg time     Min time     Max time
 Copy:        1581.8068       0.0203       0.0202       0.0204
 Scale:       1557.1081       0.0207       0.0206       0.0215
 Add:         1807.7015       0.0266       0.0266       0.0266
 Triad:       1832.9646       0.0263       0.0262       0.0265
 -------------------------------------------------------------
 Solution Validates
 -------------------------------------------------------------
 
 hcl06.heterogeneous.ucd.ie
 -------------------------------------------------------------
 STREAM version $Revision: 5.9 $
 -------------------------------------------------------------
 This system uses 8 bytes per DOUBLE PRECISION word.
 -------------------------------------------------------------
 Array size = 2000000, Offset = 0
 Total memory required = 45.8 MB.
 Each test is run 10 times, but only the *best* time for each is used.
 -------------------------------------------------------------
 Number of Threads requested = 1
 -------------------------------------------------------------
 Printing one line per active thread....
 -------------------------------------------------------------
 Your clock granularity appears to be less than one microsecond.
 Each test below will take on the order of 14824 microseconds. (= 14824 clock ticks)
 Increase the size of the arrays if this shows that you are not getting at least 20 clock ticks per test.
 -------------------------------------------------------------
 WARNING -- The above is only a rough guideline.
 For best results, please be sure you know the precision of your system timer.
 -------------------------------------------------------------
 Function      Rate (MB/s)   Avg time     Min time     Max time
 Copy:        1558.3835       0.0206       0.0205       0.0206
 Scale:       1550.3951       0.0207       0.0206       0.0209
 Add:         1863.4239       0.0258       0.0258       0.0259
 Triad:       1885.3916       0.0255       0.0255       0.0259
 -------------------------------------------------------------
 Solution Validates
 -------------------------------------------------------------
 
 hcl07.heterogeneous.ucd.ie
 -------------------------------------------------------------
 STREAM version $Revision: 5.9 $
 -------------------------------------------------------------
 This system uses 8 bytes per DOUBLE PRECISION word.
 -------------------------------------------------------------
 Array size = 2000000, Offset = 0
 Total memory required = 45.8 MB.
 Each test is run 10 times, but only the *best* time for each is used.
 -------------------------------------------------------------
 Number of Threads requested = 1
 -------------------------------------------------------------
 Printing one line per active thread....
 -------------------------------------------------------------
 Your clock granularity appears to be less than one microsecond.
 Each test below will take on the order of 13642 microseconds. (= 13642 clock ticks)
 Increase the size of the arrays if this shows that you are not getting at least 20 clock ticks per test.
 -------------------------------------------------------------
 WARNING -- The above is only a rough guideline.
 For best results, please be sure you know the precision of your system timer.
 -------------------------------------------------------------
 Function      Rate (MB/s)   Avg time     Min time     Max time
 Copy:        1759.4014       0.0183       0.0182       0.0186
 Scale:       1740.2731       0.0184       0.0184       0.0185
 Add:         2036.4962       0.0236       0.0236       0.0238
 Triad:       2045.5920       0.0235       0.0235       0.0235
 -------------------------------------------------------------
 Solution Validates
 -------------------------------------------------------------
 
 hcl08.heterogeneous.ucd.ie
 -------------------------------------------------------------
 STREAM version $Revision: 5.9 $
 -------------------------------------------------------------
 This system uses 8 bytes per DOUBLE PRECISION word.
 -------------------------------------------------------------
 Array size = 2000000, Offset = 0
 Total memory required = 45.8 MB.
 Each test is run 10 times, but only the *best* time for each is used.
 -------------------------------------------------------------
 Number of Threads requested = 1
 -------------------------------------------------------------
 Printing one line per active thread....
 -------------------------------------------------------------
 Your clock granularity appears to be less than one microsecond.
 Each test below will take on the order of 13476 microseconds. (= 13476 clock ticks)
 Increase the size of the arrays if this shows that you are not getting at least 20 clock ticks per test.
 -------------------------------------------------------------
 WARNING -- The above is only a rough guideline.
 For best results, please be sure you know the precision of your system timer.
 -------------------------------------------------------------
 Function      Rate (MB/s)   Avg time     Min time     Max time
 Copy:        1809.6454       0.0177       0.0177       0.0177
 Scale:       1784.8271       0.0179       0.0179       0.0179
 Add:         2085.3320       0.0231       0.0230       0.0232
 Triad:       2095.7094       0.0231       0.0229       0.0239
 -------------------------------------------------------------
 Solution Validates
 -------------------------------------------------------------
 
 hcl09.heterogeneous.ucd.ie
 -------------------------------------------------------------
 STREAM version $Revision: 5.9 $
 -------------------------------------------------------------
 This system uses 8 bytes per DOUBLE PRECISION word.
 -------------------------------------------------------------
 Array size = 2000000, Offset = 0
 Total memory required = 45.8 MB.
 Each test is run 10 times, but only the *best* time for each is used.
 -------------------------------------------------------------
 Number of Threads requested = 2
 -------------------------------------------------------------
 Printing one line per active thread....
 Printing one line per active thread....
 -------------------------------------------------------------
 Your clock granularity/precision appears to be 2 microseconds.
 Each test below will take on the order of 7890 microseconds. (= 3945 clock ticks)
 Increase the size of the arrays if this shows that you are not getting at least 20 clock ticks per test.
 -------------------------------------------------------------
 WARNING -- The above is only a rough guideline.
 For best results, please be sure you know the precision of your system timer.
 -------------------------------------------------------------
 Function      Rate (MB/s)   Avg time     Min time     Max time
 Copy:        2838.6204       0.0114       0.0113       0.0115
 Scale:       2774.6396       0.0116       0.0115       0.0117
 Add:         3144.0396       0.0155       0.0153       0.0156
 Triad:       3160.7905       0.0153       0.0152       0.0156
 -------------------------------------------------------------
 Solution Validates
 -------------------------------------------------------------
 
 hcl10.heterogeneous.ucd.ie
 -------------------------------------------------------------
 STREAM version $Revision: 5.9 $
 -------------------------------------------------------------
 This system uses 8 bytes per DOUBLE PRECISION word.
 -------------------------------------------------------------
 Array size = 2000000, Offset = 0
 Total memory required = 45.8 MB.
 Each test is run 10 times, but only the *best* time for each is used.
 -------------------------------------------------------------
 Number of Threads requested = 2
 -------------------------------------------------------------
 Printing one line per active thread....
 Printing one line per active thread....
 -------------------------------------------------------------
 Your clock granularity/precision appears to be 2 microseconds.
 Each test below will take on the order of 7959 microseconds. (= 3979 clock ticks)
 Increase the size of the arrays if this shows that you are not getting at least 20 clock ticks per test.
 -------------------------------------------------------------
 WARNING -- The above is only a rough guideline.
 For best results, please be sure you know the precision of your system timer.
------------------------------------------------------------- Function Rate (MB/s) Avg time Min time Max time Copy: 2864.2732 0.0113 0.0112 0.0114 Scale: 2784.2952 0.0116 0.0115 0.0118 Add: 3162.2553 0.0153 0.0152 0.0155 Triad: 3223.8430 0.0151 0.0149 0.0153 ------------------------------------------------------------- Solution Validates ------------------------------------------------------------- hcl11.heterogeneous.ucd.ie ------------------------------------------------------------- STREAM version $Revision: 5.9 $ ------------------------------------------------------------- This system uses 8 bytes per DOUBLE PRECISION word. ------------------------------------------------------------- Array size = 2000000, Offset = 0 Total memory required = 45.8 MB. Each test is run 10 times, but only the *best* time for each is used. ------------------------------------------------------------- Number of Threads requested = 1 ------------------------------------------------------------- Printing one line per active thread.... ------------------------------------------------------------- Your clock granularity/precision appears to be 1 microseconds. Each test below will take on the order of 13082 microseconds. (= 13082 clock ticks) Increase the size of the arrays if this shows that you are not getting at least 20 clock ticks per test. ------------------------------------------------------------- WARNING -- The above is only a rough guideline. For best results, please be sure you know the precision of your system timer. 
------------------------------------------------------------- Function Rate (MB/s) Avg time Min time Max time Copy: 1842.8984 0.0174 0.0174 0.0174 Scale: 1818.1816 0.0177 0.0176 0.0179 Add: 2109.8918 0.0228 0.0227 0.0231 Triad: 2119.0161 0.0227 0.0227 0.0227 ------------------------------------------------------------- Solution Validates ------------------------------------------------------------- hcl12.heterogeneous.ucd.ie ------------------------------------------------------------- STREAM version $Revision: 5.9 $ ------------------------------------------------------------- This system uses 8 bytes per DOUBLE PRECISION word. ------------------------------------------------------------- Array size = 2000000, Offset = 0 Total memory required = 45.8 MB. Each test is run 10 times, but only the *best* time for each is used. ------------------------------------------------------------- Number of Threads requested = 1 ------------------------------------------------------------- Printing one line per active thread.... ------------------------------------------------------------- Your clock granularity appears to be less than one microsecond. Each test below will take on the order of 12621 microseconds. (= 12621 clock ticks) Increase the size of the arrays if this shows that you are not getting at least 20 clock ticks per test. ------------------------------------------------------------- WARNING -- The above is only a rough guideline. For best results, please be sure you know the precision of your system timer. 
------------------------------------------------------------- Function Rate (MB/s) Avg time Min time Max time Copy: 1805.0657 0.0178 0.0177 0.0179 Scale: 1795.5385 0.0179 0.0178 0.0182 Add: 2089.9636 0.0230 0.0230 0.0230 Triad: 2097.5313 0.0229 0.0229 0.0231 ------------------------------------------------------------- Solution Validates ------------------------------------------------------------- hcl13.heterogeneous.ucd.ie ------------------------------------------------------------- STREAM version $Revision: 5.9 $ ------------------------------------------------------------- This system uses 8 bytes per DOUBLE PRECISION word. ------------------------------------------------------------- Array size = 2000000, Offset = 0 Total memory required = 45.8 MB. Each test is run 10 times, but only the *best* time for each is used. ------------------------------------------------------------- Number of Threads requested = 1 ------------------------------------------------------------- Printing one line per active thread.... ------------------------------------------------------------- Your clock granularity appears to be less than one microsecond. Each test below will take on the order of 12203 microseconds. (= 12203 clock ticks) Increase the size of the arrays if this shows that you are not getting at least 20 clock ticks per test. ------------------------------------------------------------- WARNING -- The above is only a rough guideline. For best results, please be sure you know the precision of your system timer. 
------------------------------------------------------------- Function Rate (MB/s) Avg time Min time Max time Copy: 1768.9387 0.0181 0.0181 0.0182 Scale: 1765.9100 0.0182 0.0181 0.0184 Add: 2000.5875 0.0241 0.0240 0.0244 Triad: 2001.0897 0.0240 0.0240 0.0241 ------------------------------------------------------------- Solution Validates ------------------------------------------------------------- hcl14.heterogeneous.ucd.ie ------------------------------------------------------------- STREAM version $Revision: 5.9 $ ------------------------------------------------------------- This system uses 8 bytes per DOUBLE PRECISION word. ------------------------------------------------------------- Array size = 2000000, Offset = 0 Total memory required = 45.8 MB. Each test is run 10 times, but only the *best* time for each is used. ------------------------------------------------------------- Number of Threads requested = 1 ------------------------------------------------------------- Printing one line per active thread.... ------------------------------------------------------------- Your clock granularity appears to be less than one microsecond. Each test below will take on the order of 9576 microseconds. (= 9576 clock ticks) Increase the size of the arrays if this shows that you are not getting at least 20 clock ticks per test. ------------------------------------------------------------- WARNING -- The above is only a rough guideline. For best results, please be sure you know the precision of your system timer. 
------------------------------------------------------------- Function Rate (MB/s) Avg time Min time Max time Copy: 2254.3285 0.0142 0.0142 0.0142 Scale: 2291.6040 0.0140 0.0140 0.0140 Add: 2779.5354 0.0173 0.0173 0.0173 Triad: 2804.8951 0.0177 0.0171 0.0215 ------------------------------------------------------------- Solution Validates ------------------------------------------------------------- hcl15.heterogeneous.ucd.ie ------------------------------------------------------------- STREAM version $Revision: 5.9 $ ------------------------------------------------------------- This system uses 8 bytes per DOUBLE PRECISION word. ------------------------------------------------------------- Array size = 2000000, Offset = 0 Total memory required = 45.8 MB. Each test is run 10 times, but only the *best* time for each is used. ------------------------------------------------------------- Number of Threads requested = 1 ------------------------------------------------------------- Printing one line per active thread.... ------------------------------------------------------------- Your clock granularity appears to be less than one microsecond. Each test below will take on the order of 9788 microseconds. (= 9788 clock ticks) Increase the size of the arrays if this shows that you are not getting at least 20 clock ticks per test. ------------------------------------------------------------- WARNING -- The above is only a rough guideline. For best results, please be sure you know the precision of your system timer. 
------------------------------------------------------------- Function Rate (MB/s) Avg time Min time Max time Copy: 2223.0043 0.0146 0.0144 0.0161 Scale: 2264.6703 0.0142 0.0141 0.0142 Add: 2740.5171 0.0176 0.0175 0.0178 Triad: 2762.2584 0.0174 0.0174 0.0174 ------------------------------------------------------------- Solution Validates ------------------------------------------------------------- hcl16.heterogeneous.ucd.ie ------------------------------------------------------------- STREAM version $Revision: 5.9 $ ------------------------------------------------------------- This system uses 8 bytes per DOUBLE PRECISION word. ------------------------------------------------------------- Array size = 2000000, Offset = 0 Total memory required = 45.8 MB. Each test is run 10 times, but only the *best* time for each is used. ------------------------------------------------------------- Number of Threads requested = 1 ------------------------------------------------------------- Printing one line per active thread.... ------------------------------------------------------------- Your clock granularity appears to be less than one microsecond. Each test below will take on the order of 9590 microseconds. (= 9590 clock ticks) Increase the size of the arrays if this shows that you are not getting at least 20 clock ticks per test. ------------------------------------------------------------- WARNING -- The above is only a rough guideline. For best results, please be sure you know the precision of your system timer. 
 Function      Rate (MB/s)   Avg time     Min time     Max time
 Copy:        2285.3882      0.0140       0.0140       0.0143
 Scale:       2294.5687      0.0140       0.0139       0.0140
 Add:        2800.9568       0.0172       0.0171       0.0172
 Triad:      2823.2048       0.0170       0.0170       0.0171
 -------------------------------------------------------------
 Solution Validates
 -------------------------------------------------------------

Revision 471 (Rhiggins, 2010-08-22T11:29:54Z):

{| border="1" cellspacing="1" cellpadding="5"
! Name !! Make/Model !! IP !! Processor !! Front Side Bus !! L2 Cache !! RAM !! HDD 1 !! HDD 2 !! NIC !! Rack
|-
| hclswitch1 || Cisco Catalyst 3560G || 192.168.21.252 || N/A || N/A || N/A || N/A || N/A || N/A || 24 x Gigabit || 42
|-
| hclswitch2 || Cisco Catalyst 3560G || 192.168.21.253 || N/A || N/A || N/A || N/A || N/A || N/A || 24 x Gigabit || 41
|-
| N/A || APC Smart UPS 1500 || N/A || N/A || N/A || N/A || N/A || N/A || N/A || N/A || 1 – 2
|-
| [[Hcl01]] (NIC1) || Dell Poweredge 750 || 192.168.21.3 || 3.4 Xeon || 800MHz || 1MB || 1GB || 80GB SATA || 250GB SATA || 2 x Gigabit || 3
|-
| Hcl01 (NIC2) || || 192.168.21.103 || || || || || || || || 3
|-
| [[Hcl02]] (NIC1) || Dell Poweredge 750 || 192.168.21.4 || 3.4 Xeon || 800MHz || 1MB || 1GB || 80GB SATA || 250GB SATA || 2 x Gigabit || 4
|-
| Hcl02 (NIC2) || || 192.168.21.104 || || || || || || || || 4
|-
| [[Hcl03]] (NIC1) || Dell Poweredge 750 || 192.168.21.5 || 3.4 Xeon || 800MHz || 1MB || 1GB || 80GB SATA || N/A || 2 x Gigabit || 5
|-
| Hcl03 (NIC2) || || 192.168.21.105 || || || || || || || || 5
|-
| [[Hcl04]] (NIC1) || Dell Poweredge 750 || 192.168.21.6 || 3.4 Xeon || 800MHz || 1MB || 1GB || 80GB SATA || N/A || 2 x Gigabit || 6
|-
| Hcl04 (NIC2) || || 192.168.21.106 || || || || || || || || 6
|-
| [[Hcl05]] (NIC1) || Dell Poweredge SC1425 || 192.168.21.7 || 3.6 Xeon || 800MHz || 2MB || 256MB || 80GB SATA || N/A || 2 x Gigabit || 7
|-
| Hcl05 (NIC2) || || 192.168.21.107 || || || || || || || || 7
|-
| [[Hcl06]] (NIC1) || Dell Poweredge SC1425 || 192.168.21.8 || 3.0 Xeon || 800MHz || 2MB || 256MB || 80GB SATA || N/A || 2 x Gigabit || 8
|-
| Hcl06 (NIC2) || || 192.168.21.108 || || || || || || || || 8
|-
| [[Hcl07]] (NIC1) || Dell Poweredge 750 || 192.168.21.9 || 3.4 Xeon || 800MHz || 1MB || 256MB || 80GB SATA || N/A || 2 x Gigabit || 9
|-
| Hcl07 (NIC2) || || 192.168.21.109 || || || || || || || || 9
|-
| [[Hcl08]] (NIC1) || Dell Poweredge 750 || 192.168.21.10 || 3.4 Xeon || 800MHz || 1MB || 256MB || 80GB SATA || N/A || 2 x Gigabit || 10
|-
| Hcl08 (NIC2) || || 192.168.21.110 || || || || || || || || 10
|-
| [[Hcl09]] (NIC1) || IBM E-server 326 || 192.168.21.11 || 1.8 AMD Opteron || 1GHz || 1MB || 1GB || 80GB SATA || N/A || 2 x Gigabit || 11
|-
| Hcl09 (NIC2) || || 192.168.21.111 || || || || || || || || 11
|-
| [[Hcl10]] (NIC1) || IBM E-server 326 || 192.168.21.12 || 1.8 AMD Opteron || 1GHz || 1MB || 1GB || 80GB SATA || N/A || 2 x Gigabit || 12
|-
| Hcl10 (NIC2) || || 192.168.21.112 || || || || || || || || 12
|-
| [[Hcl11]] (NIC1) || IBM X-Series 306 || 192.168.21.13 || 3.2 P4 || 800MHz || 1MB || 512MB || 80GB SATA || N/A || 2 x Gigabit || 13
|-
| Hcl11 (NIC2) || || 192.168.21.113 || || || || || || || || 13
|-
| [[Hcl12]] (NIC1) || HP Proliant DL 320 G3 || 192.168.21.14 || 3.4 P4 || 800MHz || 1MB || 512MB || 80GB SATA || N/A || 2 x Gigabit || 14
|-
| Hcl12 (NIC2) || || 192.168.21.114 || || || || || || || || 14
|-
| [[Hcl13]] (NIC1) || HP Proliant DL 320 G3 || 192.168.21.15 || 2.9 Celeron || 533MHz || 256KB || 1GB || 80GB SATA || N/A || 2 x Gigabit || 15
|-
| Hcl13 (NIC2) || || 192.168.21.115 || || || || || || || || 15
|-
| [[Hcl14]] (NIC1) || HP Proliant DL 140 G2 || 192.168.21.16 || 3.4 Xeon || 800MHz || 1MB || 1GB || 80GB SATA || N/A || 2 x Gigabit || 16
|-
| Hcl14 (NIC2) || || 192.168.21.116 || || || || || || || || 16
|-
| [[Hcl15]] (NIC1) || HP Proliant DL 140 G2 || 192.168.21.17 || 2.8 Xeon || 800MHz || 1MB || 1GB || 80GB SATA || N/A || 2 x Gigabit || 17
|-
| Hcl15 (NIC2) || || 192.168.21.117 || || || || || || || || 17
|-
| [[Hcl16]] (NIC1) || HP Proliant DL 140 G2 || 192.168.21.18 || 3.6 Xeon || 800MHz || 2MB || 1GB || 80GB SATA || N/A || 2 x Gigabit || 18
|-
| Hcl16 (NIC2) || || 192.168.21.118 || || || || || || || || 18
|}

==Cluster Benchmarks==
===Stream===
====Cluster Performance====
 ----------------------------------------------
 Double precision appears to have 16 digits of accuracy
 Assuming 8 bytes per DOUBLE PRECISION word
 ----------------------------------------------
 Number of processors = 18
 Array size = 2000000
 Offset = 0
 The total memory requirement is 824.0 MB ( 45.8MB/task)
 You are running each test 10 times
 --
 The *best* time for each test is used
 *EXCLUDING* the first and last iterations
 ----------------------------------------------------
 Your clock granularity appears to be less than one microsecond
 Your clock granularity/precision appears to be 1 microseconds
 ----------------------------------------------------
 Function      Rate (MB/s)   Avg time     Min time     Max time
 Copy:       24589.1750      0.0235       0.0234       0.0237
 Scale:      24493.9786      0.0237       0.0235       0.0245
 Add:        27594.1797      0.0314       0.0313       0.0315
 Triad:      27695.7938      0.0313       0.0312       0.0315
 -----------------------------------------------
 Solution Validates!
 -----------------------------------------------

====Individual Node Performance====
[[Image:stream_bench_results.png]]

 hcl01.heterogeneous.ucd.ie
 -------------------------------------------------------------
 STREAM version $Revision: 5.9 $
 -------------------------------------------------------------
 This system uses 8 bytes per DOUBLE PRECISION word.
 -------------------------------------------------------------
 Array size = 2000000, Offset = 0
 Total memory required = 45.8 MB.
 Each test is run 10 times, but only the *best* time for each is used.
 -------------------------------------------------------------
 Number of Threads requested = 1
 -------------------------------------------------------------
 Printing one line per active thread....
 -------------------------------------------------------------
 Your clock granularity appears to be less than one microsecond.
 Each test below will take on the order of 8386 microseconds.
    (= 8386 clock ticks)
 Increase the size of the arrays if this shows that
 you are not getting at least 20 clock ticks per test.
 -------------------------------------------------------------
 WARNING -- The above is only a rough guideline.
 For best results, please be sure you know the
 precision of your system timer.
 -------------------------------------------------------------
 Function      Rate (MB/s)   Avg time     Min time     Max time
 Copy:        2615.4617      0.0125       0.0122       0.0132
 Scale:       2609.2783      0.0125       0.0123       0.0133
 Add:        3046.4707       0.0161       0.0158       0.0168
 Triad:      3064.7322       0.0160       0.0157       0.0166
 -------------------------------------------------------------
 Solution Validates
 -------------------------------------------------------------
 
 hcl02.heterogeneous.ucd.ie
 -------------------------------------------------------------
 Number of Threads requested = 1
 Your clock granularity appears to be less than one microsecond.
 Each test below will take on the order of 7790 microseconds.
 -------------------------------------------------------------
 Function      Rate (MB/s)   Avg time     Min time     Max time
 Copy:        2740.6840      0.0117       0.0117       0.0120
 Scale:       2745.3930      0.0117       0.0117       0.0117
 Add:        3063.1599       0.0157       0.0157       0.0157
 Triad:      3075.3572       0.0156       0.0156       0.0159
 -------------------------------------------------------------
 Solution Validates
 -------------------------------------------------------------
 
 hcl03.heterogeneous.ucd.ie
 -------------------------------------------------------------
 Number of Threads requested = 1
 Your clock granularity appears to be less than one microsecond.
 Each test below will take on the order of 8382 microseconds.
 -------------------------------------------------------------
 Function      Rate (MB/s)   Avg time     Min time     Max time
 Copy:        2788.6728      0.0115       0.0115       0.0115
 Scale:       2722.0144      0.0118       0.0118       0.0121
 Add:        3266.6166       0.0148       0.0147       0.0150
 Triad:      3287.4249       0.0146       0.0146       0.0147
 -------------------------------------------------------------
 Solution Validates
 -------------------------------------------------------------
 
 hcl04.heterogeneous.ucd.ie
 -------------------------------------------------------------
 Number of Threads requested = 1
 Your clock granularity appears to be less than one microsecond.
 Each test below will take on the order of 8378 microseconds.
 -------------------------------------------------------------
 Function      Rate (MB/s)   Avg time     Min time     Max time
 Copy:        2815.4071      0.0114       0.0114       0.0114
 Scale:       2751.5270      0.0116       0.0116       0.0117
 Add:        3260.1979       0.0148       0.0147       0.0150
 Triad:      3274.2183       0.0147       0.0147       0.0150
 -------------------------------------------------------------
 Solution Validates
 -------------------------------------------------------------
 
 hcl05.heterogeneous.ucd.ie
 -------------------------------------------------------------
 Number of Threads requested = 1
 Your clock granularity appears to be less than one microsecond.
 Each test below will take on the order of 14916 microseconds.
 -------------------------------------------------------------
 Function      Rate (MB/s)   Avg time     Min time     Max time
 Copy:        1581.8068      0.0203       0.0202       0.0204
 Scale:       1557.1081      0.0207       0.0206       0.0215
 Add:        1807.7015       0.0266       0.0266       0.0266
 Triad:      1832.9646       0.0263       0.0262       0.0265
 -------------------------------------------------------------
 Solution Validates
 -------------------------------------------------------------
 
 hcl06.heterogeneous.ucd.ie
 -------------------------------------------------------------
 Number of Threads requested = 1
 Your clock granularity appears to be less than one microsecond.
 Each test below will take on the order of 14824 microseconds.
 -------------------------------------------------------------
 Function      Rate (MB/s)   Avg time     Min time     Max time
 Copy:        1558.3835      0.0206       0.0205       0.0206
 Scale:       1550.3951      0.0207       0.0206       0.0209
 Add:        1863.4239       0.0258       0.0258       0.0259
 Triad:      1885.3916       0.0255       0.0255       0.0259
 -------------------------------------------------------------
 Solution Validates
 -------------------------------------------------------------
 
 hcl07.heterogeneous.ucd.ie
 -------------------------------------------------------------
 Number of Threads requested = 1
 Your clock granularity appears to be less than one microsecond.
 Each test below will take on the order of 13642 microseconds.
 -------------------------------------------------------------
 Function      Rate (MB/s)   Avg time     Min time     Max time
 Copy:        1759.4014      0.0183       0.0182       0.0186
 Scale:       1740.2731      0.0184       0.0184       0.0185
 Add:        2036.4962       0.0236       0.0236       0.0238
 Triad:      2045.5920       0.0235       0.0235       0.0235
 -------------------------------------------------------------
 Solution Validates
 -------------------------------------------------------------
 
 hcl08.heterogeneous.ucd.ie
 -------------------------------------------------------------
 Number of Threads requested = 1
 Your clock granularity appears to be less than one microsecond.
 Each test below will take on the order of 13476 microseconds.
 -------------------------------------------------------------
 Function      Rate (MB/s)   Avg time     Min time     Max time
 Copy:        1809.6454      0.0177       0.0177       0.0177
 Scale:       1784.8271      0.0179       0.0179       0.0179
 Add:        2085.3320       0.0231       0.0230       0.0232
 Triad:      2095.7094       0.0231       0.0229       0.0239
 -------------------------------------------------------------
 Solution Validates
 -------------------------------------------------------------
 
 hcl09.heterogeneous.ucd.ie
 -------------------------------------------------------------
 Number of Threads requested = 2
 Your clock granularity/precision appears to be 2 microseconds.
 Each test below will take on the order of 7890 microseconds.
    (= 3945 clock ticks)
 -------------------------------------------------------------
------------------------------------------------------------- Function Rate (MB/s) Avg time Min time Max time Copy: 2838.6204 0.0114 0.0113 0.0115 Scale: 2774.6396 0.0116 0.0115 0.0117 Add: 3144.0396 0.0155 0.0153 0.0156 Triad: 3160.7905 0.0153 0.0152 0.0156 ------------------------------------------------------------- Solution Validates ------------------------------------------------------------- hcl10.heterogeneous.ucd.ie ------------------------------------------------------------- STREAM version $Revision: 5.9 $ ------------------------------------------------------------- This system uses 8 bytes per DOUBLE PRECISION word. ------------------------------------------------------------- Array size = 2000000, Offset = 0 Total memory required = 45.8 MB. Each test is run 10 times, but only the *best* time for each is used. ------------------------------------------------------------- Number of Threads requested = 2 ------------------------------------------------------------- Printing one line per active thread.... Printing one line per active thread.... ------------------------------------------------------------- Your clock granularity/precision appears to be 2 microseconds. Each test below will take on the order of 7959 microseconds. (= 3979 clock ticks) Increase the size of the arrays if this shows that you are not getting at least 20 clock ticks per test. ------------------------------------------------------------- WARNING -- The above is only a rough guideline. For best results, please be sure you know the precision of your system timer. 
------------------------------------------------------------- Function Rate (MB/s) Avg time Min time Max time Copy: 2864.2732 0.0113 0.0112 0.0114 Scale: 2784.2952 0.0116 0.0115 0.0118 Add: 3162.2553 0.0153 0.0152 0.0155 Triad: 3223.8430 0.0151 0.0149 0.0153 ------------------------------------------------------------- Solution Validates ------------------------------------------------------------- hcl11.heterogeneous.ucd.ie ------------------------------------------------------------- STREAM version $Revision: 5.9 $ ------------------------------------------------------------- This system uses 8 bytes per DOUBLE PRECISION word. ------------------------------------------------------------- Array size = 2000000, Offset = 0 Total memory required = 45.8 MB. Each test is run 10 times, but only the *best* time for each is used. ------------------------------------------------------------- Number of Threads requested = 1 ------------------------------------------------------------- Printing one line per active thread.... ------------------------------------------------------------- Your clock granularity/precision appears to be 1 microseconds. Each test below will take on the order of 13082 microseconds. (= 13082 clock ticks) Increase the size of the arrays if this shows that you are not getting at least 20 clock ticks per test. ------------------------------------------------------------- WARNING -- The above is only a rough guideline. For best results, please be sure you know the precision of your system timer. 
------------------------------------------------------------- Function Rate (MB/s) Avg time Min time Max time Copy: 1842.8984 0.0174 0.0174 0.0174 Scale: 1818.1816 0.0177 0.0176 0.0179 Add: 2109.8918 0.0228 0.0227 0.0231 Triad: 2119.0161 0.0227 0.0227 0.0227 ------------------------------------------------------------- Solution Validates ------------------------------------------------------------- hcl12.heterogeneous.ucd.ie ------------------------------------------------------------- STREAM version $Revision: 5.9 $ ------------------------------------------------------------- This system uses 8 bytes per DOUBLE PRECISION word. ------------------------------------------------------------- Array size = 2000000, Offset = 0 Total memory required = 45.8 MB. Each test is run 10 times, but only the *best* time for each is used. ------------------------------------------------------------- Number of Threads requested = 1 ------------------------------------------------------------- Printing one line per active thread.... ------------------------------------------------------------- Your clock granularity appears to be less than one microsecond. Each test below will take on the order of 12621 microseconds. (= 12621 clock ticks) Increase the size of the arrays if this shows that you are not getting at least 20 clock ticks per test. ------------------------------------------------------------- WARNING -- The above is only a rough guideline. For best results, please be sure you know the precision of your system timer. 
------------------------------------------------------------- Function Rate (MB/s) Avg time Min time Max time Copy: 1805.0657 0.0178 0.0177 0.0179 Scale: 1795.5385 0.0179 0.0178 0.0182 Add: 2089.9636 0.0230 0.0230 0.0230 Triad: 2097.5313 0.0229 0.0229 0.0231 ------------------------------------------------------------- Solution Validates ------------------------------------------------------------- hcl13.heterogeneous.ucd.ie ------------------------------------------------------------- STREAM version $Revision: 5.9 $ ------------------------------------------------------------- This system uses 8 bytes per DOUBLE PRECISION word. ------------------------------------------------------------- Array size = 2000000, Offset = 0 Total memory required = 45.8 MB. Each test is run 10 times, but only the *best* time for each is used. ------------------------------------------------------------- Number of Threads requested = 1 ------------------------------------------------------------- Printing one line per active thread.... ------------------------------------------------------------- Your clock granularity appears to be less than one microsecond. Each test below will take on the order of 12203 microseconds. (= 12203 clock ticks) Increase the size of the arrays if this shows that you are not getting at least 20 clock ticks per test. ------------------------------------------------------------- WARNING -- The above is only a rough guideline. For best results, please be sure you know the precision of your system timer. 
------------------------------------------------------------- Function Rate (MB/s) Avg time Min time Max time Copy: 1768.9387 0.0181 0.0181 0.0182 Scale: 1765.9100 0.0182 0.0181 0.0184 Add: 2000.5875 0.0241 0.0240 0.0244 Triad: 2001.0897 0.0240 0.0240 0.0241 ------------------------------------------------------------- Solution Validates ------------------------------------------------------------- hcl14.heterogeneous.ucd.ie ------------------------------------------------------------- STREAM version $Revision: 5.9 $ ------------------------------------------------------------- This system uses 8 bytes per DOUBLE PRECISION word. ------------------------------------------------------------- Array size = 2000000, Offset = 0 Total memory required = 45.8 MB. Each test is run 10 times, but only the *best* time for each is used. ------------------------------------------------------------- Number of Threads requested = 1 ------------------------------------------------------------- Printing one line per active thread.... ------------------------------------------------------------- Your clock granularity appears to be less than one microsecond. Each test below will take on the order of 9576 microseconds. (= 9576 clock ticks) Increase the size of the arrays if this shows that you are not getting at least 20 clock ticks per test. ------------------------------------------------------------- WARNING -- The above is only a rough guideline. For best results, please be sure you know the precision of your system timer. 
------------------------------------------------------------- Function Rate (MB/s) Avg time Min time Max time Copy: 2254.3285 0.0142 0.0142 0.0142 Scale: 2291.6040 0.0140 0.0140 0.0140 Add: 2779.5354 0.0173 0.0173 0.0173 Triad: 2804.8951 0.0177 0.0171 0.0215 ------------------------------------------------------------- Solution Validates ------------------------------------------------------------- hcl15.heterogeneous.ucd.ie ------------------------------------------------------------- STREAM version $Revision: 5.9 $ ------------------------------------------------------------- This system uses 8 bytes per DOUBLE PRECISION word. ------------------------------------------------------------- Array size = 2000000, Offset = 0 Total memory required = 45.8 MB. Each test is run 10 times, but only the *best* time for each is used. ------------------------------------------------------------- Number of Threads requested = 1 ------------------------------------------------------------- Printing one line per active thread.... ------------------------------------------------------------- Your clock granularity appears to be less than one microsecond. Each test below will take on the order of 9788 microseconds. (= 9788 clock ticks) Increase the size of the arrays if this shows that you are not getting at least 20 clock ticks per test. ------------------------------------------------------------- WARNING -- The above is only a rough guideline. For best results, please be sure you know the precision of your system timer. 
------------------------------------------------------------- Function Rate (MB/s) Avg time Min time Max time Copy: 2223.0043 0.0146 0.0144 0.0161 Scale: 2264.6703 0.0142 0.0141 0.0142 Add: 2740.5171 0.0176 0.0175 0.0178 Triad: 2762.2584 0.0174 0.0174 0.0174 ------------------------------------------------------------- Solution Validates ------------------------------------------------------------- hcl16.heterogeneous.ucd.ie ------------------------------------------------------------- STREAM version $Revision: 5.9 $ ------------------------------------------------------------- This system uses 8 bytes per DOUBLE PRECISION word. ------------------------------------------------------------- Array size = 2000000, Offset = 0 Total memory required = 45.8 MB. Each test is run 10 times, but only the *best* time for each is used. ------------------------------------------------------------- Number of Threads requested = 1 ------------------------------------------------------------- Printing one line per active thread.... ------------------------------------------------------------- Your clock granularity appears to be less than one microsecond. Each test below will take on the order of 9590 microseconds. (= 9590 clock ticks) Increase the size of the arrays if this shows that you are not getting at least 20 clock ticks per test. ------------------------------------------------------------- WARNING -- The above is only a rough guideline. For best results, please be sure you know the precision of your system timer. 
 Function      Rate (MB/s)   Avg time     Min time     Max time
 Copy:        2285.3882       0.0140       0.0140       0.0143
 Scale:       2294.5687       0.0140       0.0139       0.0140
 Add:         2800.9568       0.0172       0.0171       0.0172
 Triad:       2823.2048       0.0170       0.0170       0.0171
 Solution Validates

''Revision 472 (parent 471), 2010-08-22T11:31:04Z, by Rhiggins:''

==Cluster Specifications==

{| border="1" cellspacing="1" cellpadding="5"
! Name !! Make/Model !! IP !! Processor !! Front Side Bus !! L2 Cache !! RAM !! HDD 1 !! HDD 2 !! NIC !! Rack
|-
| hclswitch1 || Cisco Catalyst 3560G || 192.168.21.252 || N/A || N/A || N/A || N/A || N/A || N/A || 24 x Gigabit || 42
|-
| hclswitch2 || Cisco Catalyst 3560G || 192.168.21.253 || N/A || N/A || N/A || N/A || N/A || N/A || 24 x Gigabit || 41
|-
| N/A || APC Smart UPS 1500 || N/A || N/A || N/A || N/A || N/A || N/A || N/A || N/A || 1 – 2
|-
| [[Hcl01]] (NIC1) || Dell Poweredge 750 || 192.168.21.3 || 3.4 Xeon || 800MHz || 1MB || 1GB || 80GB SATA || 250GB SATA || 2 x Gigabit || 3
|-
| Hcl01 (NIC2) || <BR> || 192.168.21.103 || <BR> || <BR> || <BR> || <BR> || <BR> || <BR> || <BR> || 3
|-
| [[Hcl02]] (NIC1) || Dell Poweredge 750 || 192.168.21.4 || 3.4 Xeon || 800MHz || 1MB || 1GB || 80GB SATA || 250GB SATA || 2 x Gigabit || 4
|-
| Hcl02 (NIC2) || <BR> || 192.168.21.104 || <BR> || <BR> || <BR> || <BR> || <BR> || <BR> || <BR> || 4
|-
| [[Hcl03]] (NIC1) || Dell Poweredge 750 || 192.168.21.5 || 3.4 Xeon || 800MHz || 1MB || 1GB || 80GB SATA || N/A || 2 x Gigabit || 5
|-
| Hcl03 (NIC2) || <BR> || 192.168.21.105 || <BR> || <BR> || <BR> || <BR> || <BR> || <BR> || <BR> || 5
|-
| [[Hcl04]] (NIC1) || Dell Poweredge 750 || 192.168.21.6 || 3.4 Xeon || 800MHz || 1MB || 1GB || 80GB SATA || N/A || 2 x Gigabit || 6
|-
| Hcl04 (NIC2) || <BR> || 192.168.21.106 || <BR> || <BR> || <BR> || <BR> || <BR> || <BR> || <BR> || 6
|-
| [[Hcl05]] (NIC1) || Dell Poweredge SC1425 || 192.168.21.7 || 3.6 Xeon || 800MHz || 2MB || 256MB || 80GB SATA || N/A || 2 x Gigabit || 7
|-
| Hcl05 (NIC2) || <BR> || 192.168.21.107 || <BR> || <BR> || <BR> || <BR> || <BR> || <BR> || <BR> || 7
|-
| [[Hcl06]] (NIC1) || Dell Poweredge SC1425 || 192.168.21.8 || 3.0 Xeon || 800MHz || 2MB || 256MB || 80GB SATA || N/A || 2 x Gigabit || 8
|-
| Hcl06 (NIC2) || <BR> || 192.168.21.108 || <BR> || <BR> || <BR> || <BR> || <BR> || <BR> || <BR> || 8
|-
| [[Hcl07]] (NIC1) || Dell Poweredge 750 || 192.168.21.9 || 3.4 Xeon || 800MHz || 1MB || 256MB || 80GB SATA || N/A || 2 x Gigabit || 9
|-
| Hcl07 (NIC2) || <BR> || 192.168.21.109 || <BR> || <BR> || <BR> || <BR> || <BR> || <BR> || <BR> || 9
|-
| [[Hcl08]] (NIC1) || Dell Poweredge 750 || 192.168.21.10 || 3.4 Xeon || 800MHz || 1MB || 256MB || 80GB SATA || N/A || 2 x Gigabit || 10
|-
| Hcl08 (NIC2) || <BR> || 192.168.21.110 || <BR> || <BR> || <BR> || <BR> || <BR> || <BR> || <BR> || 10
|-
| [[Hcl09]] (NIC1) || IBM E-server 326 || 192.168.21.11 || 1.8 AMD Opteron || 1GHz || 1MB || 1GB || 80GB SATA || N/A || 2 x Gigabit || 11
|-
| Hcl09 (NIC2) || <BR> || 192.168.21.111 || <BR> || <BR> || <BR> || <BR> || <BR> || <BR> || <BR> || 11
|-
| [[Hcl10]] (NIC1) || IBM E-server 326 || 192.168.21.12 || 1.8 AMD Opteron || 1GHz || 1MB || 1GB || 80GB SATA || N/A || 2 x Gigabit || 12
|-
| Hcl10 (NIC2) || <BR> || 192.168.21.112 || <BR> || <BR> || <BR> || <BR> || <BR> || <BR> || <BR> || 12
|-
| [[Hcl11]] (NIC1) || IBM X-Series 306 || 192.168.21.13 || 3.2 P4 || 800MHz || 1MB || 512MB || 80GB SATA || N/A || 2 x Gigabit || 13
|-
| Hcl11 (NIC2) || <BR> || 192.168.21.113 || <BR> || <BR> || <BR> || <BR> || <BR> || <BR> || <BR> || 13
|-
| [[Hcl12]] (NIC1) || HP Proliant DL 320 G3 || 192.168.21.14 || 3.4 P4 || 800MHz || 1MB || 512MB || 80GB SATA || N/A || 2 x Gigabit || 14
|-
| Hcl12 (NIC2) || <BR> || 192.168.21.114 || <BR> || <BR> || <BR> || <BR> || <BR> || <BR> || <BR> || 14
|-
| [[Hcl13]] (NIC1) || HP Proliant DL 320 G3 || 192.168.21.15 || 2.9 Celeron || 533MHz || 256KB || 1GB || 80GB SATA || N/A || 2 x Gigabit || 15
|-
| Hcl13 (NIC2) || <BR> || 192.168.21.115 || <BR> || <BR> || <BR> || <BR> || <BR> || <BR> || <BR> || 15
|-
| [[Hcl14]] (NIC1) || HP Proliant DL 140 G2 || 192.168.21.16 || 3.4 Xeon || 800MHz || 1MB || 1GB || 80GB SATA || N/A || 2 x Gigabit || 16
|-
| Hcl14 (NIC2) || <BR> || 192.168.21.116 || <BR> || <BR> || <BR> || <BR> || <BR> || <BR> || <BR> || 16
|-
| [[Hcl15]] (NIC1) || HP Proliant DL 140 G2 || 192.168.21.17 || 2.8 Xeon || 800MHz || 1MB || 1GB || 80GB SATA || N/A || 2 x Gigabit || 17
|-
| Hcl15 (NIC2) || <BR> || 192.168.21.117 || <BR> || <BR> || <BR> || <BR> || <BR> || <BR> || <BR> || 17
|-
| [[Hcl16]] (NIC1) || HP Proliant DL 140 G2 || 192.168.21.18 || 3.6 Xeon || 800MHz || 2MB || 1GB || 80GB SATA || N/A || 2 x Gigabit || 18
|-
| Hcl16 (NIC2) || <BR> || 192.168.21.118 || <BR> || <BR> || <BR> || <BR> || <BR> || <BR> || <BR> || 18
|}

==Cluster Benchmarks==
===Stream===
====Cluster Performance====

 Double precision appears to have 16 digits of accuracy
 Assuming 8 bytes per DOUBLE PRECISION word
 Number of processors = 18
 Array size = 2000000, Offset = 0
 The total memory requirement is 824.0 MB ( 45.8MB/task)
 You are running each test 10 times -- the *best* time for each test is used,
 *EXCLUDING* the first and last iterations
 Your clock granularity appears to be less than one microsecond
 Your clock granularity/precision appears to be 1 microseconds
 Function      Rate (MB/s)   Avg time     Min time     Max time
 Copy:       24589.1750       0.0235       0.0234       0.0237
 Scale:      24493.9786       0.0237       0.0235       0.0245
 Add:        27594.1797       0.0314       0.0313       0.0315
 Triad:      27695.7938       0.0313       0.0312       0.0315
 Solution Validates!

====Individual Node Performance====

[[Image:stream_bench_results.png]]

''Per-node STREAM 5.9 reports follow. The identical per-node header (array size = 2000000, offset = 0, 45.8 MB required, best of 10 runs) and timer-granularity warning are omitted; the thread count and estimated time per test are noted beside each hostname.''

 hcl01.heterogeneous.ucd.ie (1 thread, ~8386 microseconds per test)
''(Identical per-node STREAM headers omitted below; the thread count and estimated time per test are noted beside each hostname.)''

 Function      Rate (MB/s)   Avg time     Min time     Max time
 Copy:        2615.4617       0.0125       0.0122       0.0132
 Scale:       2609.2783       0.0125       0.0123       0.0133
 Add:         3046.4707       0.0161       0.0158       0.0168
 Triad:       3064.7322       0.0160       0.0157       0.0166
 Solution Validates

 hcl02.heterogeneous.ucd.ie (1 thread, ~7790 microseconds per test)
 Function      Rate (MB/s)   Avg time     Min time     Max time
 Copy:        2740.6840       0.0117       0.0117       0.0120
 Scale:       2745.3930       0.0117       0.0117       0.0117
 Add:         3063.1599       0.0157       0.0157       0.0157
 Triad:       3075.3572       0.0156       0.0156       0.0159
 Solution Validates

 hcl03.heterogeneous.ucd.ie (1 thread, ~8382 microseconds per test)
 Function      Rate (MB/s)   Avg time     Min time     Max time
 Copy:        2788.6728       0.0115       0.0115       0.0115
 Scale:       2722.0144       0.0118       0.0118       0.0121
 Add:         3266.6166       0.0148       0.0147       0.0150
 Triad:       3287.4249       0.0146       0.0146       0.0147
 Solution Validates

 hcl04.heterogeneous.ucd.ie (1 thread, ~8378 microseconds per test)
 Function      Rate (MB/s)   Avg time     Min time     Max time
 Copy:        2815.4071       0.0114       0.0114       0.0114
 Scale:       2751.5270       0.0116       0.0116       0.0117
 Add:         3260.1979       0.0148       0.0147       0.0150
 Triad:       3274.2183       0.0147       0.0147       0.0150
 Solution Validates

 hcl05.heterogeneous.ucd.ie (1 thread, ~14916 microseconds per test)
 Function      Rate (MB/s)   Avg time     Min time     Max time
 Copy:        1581.8068       0.0203       0.0202       0.0204
 Scale:       1557.1081       0.0207       0.0206       0.0215
 Add:         1807.7015       0.0266       0.0266       0.0266
 Triad:       1832.9646       0.0263       0.0262       0.0265
 Solution Validates

 hcl06.heterogeneous.ucd.ie (1 thread, ~14824 microseconds per test)
 Function      Rate (MB/s)   Avg time     Min time     Max time
 Copy:        1558.3835       0.0206       0.0205       0.0206
 Scale:       1550.3951       0.0207       0.0206       0.0209
 Add:         1863.4239       0.0258       0.0258       0.0259
 Triad:       1885.3916       0.0255       0.0255       0.0259
 Solution Validates

 hcl07.heterogeneous.ucd.ie (1 thread, ~13642 microseconds per test)
 Function      Rate (MB/s)   Avg time     Min time     Max time
 Copy:        1759.4014       0.0183       0.0182       0.0186
 Scale:       1740.2731       0.0184       0.0184       0.0185
 Add:         2036.4962       0.0236       0.0236       0.0238
 Triad:       2045.5920       0.0235       0.0235       0.0235
 Solution Validates

 hcl08.heterogeneous.ucd.ie (1 thread, ~13476 microseconds per test)
------------------------------------------------------------- Function Rate (MB/s) Avg time Min time Max time Copy: 1809.6454 0.0177 0.0177 0.0177 Scale: 1784.8271 0.0179 0.0179 0.0179 Add: 2085.3320 0.0231 0.0230 0.0232 Triad: 2095.7094 0.0231 0.0229 0.0239 ------------------------------------------------------------- Solution Validates ------------------------------------------------------------- hcl09.heterogeneous.ucd.ie ------------------------------------------------------------- STREAM version $Revision: 5.9 $ ------------------------------------------------------------- This system uses 8 bytes per DOUBLE PRECISION word. ------------------------------------------------------------- Array size = 2000000, Offset = 0 Total memory required = 45.8 MB. Each test is run 10 times, but only the *best* time for each is used. ------------------------------------------------------------- Number of Threads requested = 2 ------------------------------------------------------------- Printing one line per active thread.... Printing one line per active thread.... ------------------------------------------------------------- Your clock granularity/precision appears to be 2 microseconds. Each test below will take on the order of 7890 microseconds. (= 3945 clock ticks) Increase the size of the arrays if this shows that you are not getting at least 20 clock ticks per test. ------------------------------------------------------------- WARNING -- The above is only a rough guideline. For best results, please be sure you know the precision of your system timer. 
------------------------------------------------------------- Function Rate (MB/s) Avg time Min time Max time Copy: 2838.6204 0.0114 0.0113 0.0115 Scale: 2774.6396 0.0116 0.0115 0.0117 Add: 3144.0396 0.0155 0.0153 0.0156 Triad: 3160.7905 0.0153 0.0152 0.0156 ------------------------------------------------------------- Solution Validates ------------------------------------------------------------- hcl10.heterogeneous.ucd.ie ------------------------------------------------------------- STREAM version $Revision: 5.9 $ ------------------------------------------------------------- This system uses 8 bytes per DOUBLE PRECISION word. ------------------------------------------------------------- Array size = 2000000, Offset = 0 Total memory required = 45.8 MB. Each test is run 10 times, but only the *best* time for each is used. ------------------------------------------------------------- Number of Threads requested = 2 ------------------------------------------------------------- Printing one line per active thread.... Printing one line per active thread.... ------------------------------------------------------------- Your clock granularity/precision appears to be 2 microseconds. Each test below will take on the order of 7959 microseconds. (= 3979 clock ticks) Increase the size of the arrays if this shows that you are not getting at least 20 clock ticks per test. ------------------------------------------------------------- WARNING -- The above is only a rough guideline. For best results, please be sure you know the precision of your system timer. 
------------------------------------------------------------- Function Rate (MB/s) Avg time Min time Max time Copy: 2864.2732 0.0113 0.0112 0.0114 Scale: 2784.2952 0.0116 0.0115 0.0118 Add: 3162.2553 0.0153 0.0152 0.0155 Triad: 3223.8430 0.0151 0.0149 0.0153 ------------------------------------------------------------- Solution Validates ------------------------------------------------------------- hcl11.heterogeneous.ucd.ie ------------------------------------------------------------- STREAM version $Revision: 5.9 $ ------------------------------------------------------------- This system uses 8 bytes per DOUBLE PRECISION word. ------------------------------------------------------------- Array size = 2000000, Offset = 0 Total memory required = 45.8 MB. Each test is run 10 times, but only the *best* time for each is used. ------------------------------------------------------------- Number of Threads requested = 1 ------------------------------------------------------------- Printing one line per active thread.... ------------------------------------------------------------- Your clock granularity/precision appears to be 1 microseconds. Each test below will take on the order of 13082 microseconds. (= 13082 clock ticks) Increase the size of the arrays if this shows that you are not getting at least 20 clock ticks per test. ------------------------------------------------------------- WARNING -- The above is only a rough guideline. For best results, please be sure you know the precision of your system timer. 
------------------------------------------------------------- Function Rate (MB/s) Avg time Min time Max time Copy: 1842.8984 0.0174 0.0174 0.0174 Scale: 1818.1816 0.0177 0.0176 0.0179 Add: 2109.8918 0.0228 0.0227 0.0231 Triad: 2119.0161 0.0227 0.0227 0.0227 ------------------------------------------------------------- Solution Validates ------------------------------------------------------------- hcl12.heterogeneous.ucd.ie ------------------------------------------------------------- STREAM version $Revision: 5.9 $ ------------------------------------------------------------- This system uses 8 bytes per DOUBLE PRECISION word. ------------------------------------------------------------- Array size = 2000000, Offset = 0 Total memory required = 45.8 MB. Each test is run 10 times, but only the *best* time for each is used. ------------------------------------------------------------- Number of Threads requested = 1 ------------------------------------------------------------- Printing one line per active thread.... ------------------------------------------------------------- Your clock granularity appears to be less than one microsecond. Each test below will take on the order of 12621 microseconds. (= 12621 clock ticks) Increase the size of the arrays if this shows that you are not getting at least 20 clock ticks per test. ------------------------------------------------------------- WARNING -- The above is only a rough guideline. For best results, please be sure you know the precision of your system timer. 
------------------------------------------------------------- Function Rate (MB/s) Avg time Min time Max time Copy: 1805.0657 0.0178 0.0177 0.0179 Scale: 1795.5385 0.0179 0.0178 0.0182 Add: 2089.9636 0.0230 0.0230 0.0230 Triad: 2097.5313 0.0229 0.0229 0.0231 ------------------------------------------------------------- Solution Validates ------------------------------------------------------------- hcl13.heterogeneous.ucd.ie ------------------------------------------------------------- STREAM version $Revision: 5.9 $ ------------------------------------------------------------- This system uses 8 bytes per DOUBLE PRECISION word. ------------------------------------------------------------- Array size = 2000000, Offset = 0 Total memory required = 45.8 MB. Each test is run 10 times, but only the *best* time for each is used. ------------------------------------------------------------- Number of Threads requested = 1 ------------------------------------------------------------- Printing one line per active thread.... ------------------------------------------------------------- Your clock granularity appears to be less than one microsecond. Each test below will take on the order of 12203 microseconds. (= 12203 clock ticks) Increase the size of the arrays if this shows that you are not getting at least 20 clock ticks per test. ------------------------------------------------------------- WARNING -- The above is only a rough guideline. For best results, please be sure you know the precision of your system timer. 
------------------------------------------------------------- Function Rate (MB/s) Avg time Min time Max time Copy: 1768.9387 0.0181 0.0181 0.0182 Scale: 1765.9100 0.0182 0.0181 0.0184 Add: 2000.5875 0.0241 0.0240 0.0244 Triad: 2001.0897 0.0240 0.0240 0.0241 ------------------------------------------------------------- Solution Validates ------------------------------------------------------------- hcl14.heterogeneous.ucd.ie ------------------------------------------------------------- STREAM version $Revision: 5.9 $ ------------------------------------------------------------- This system uses 8 bytes per DOUBLE PRECISION word. ------------------------------------------------------------- Array size = 2000000, Offset = 0 Total memory required = 45.8 MB. Each test is run 10 times, but only the *best* time for each is used. ------------------------------------------------------------- Number of Threads requested = 1 ------------------------------------------------------------- Printing one line per active thread.... ------------------------------------------------------------- Your clock granularity appears to be less than one microsecond. Each test below will take on the order of 9576 microseconds. (= 9576 clock ticks) Increase the size of the arrays if this shows that you are not getting at least 20 clock ticks per test. ------------------------------------------------------------- WARNING -- The above is only a rough guideline. For best results, please be sure you know the precision of your system timer. 
------------------------------------------------------------- Function Rate (MB/s) Avg time Min time Max time Copy: 2254.3285 0.0142 0.0142 0.0142 Scale: 2291.6040 0.0140 0.0140 0.0140 Add: 2779.5354 0.0173 0.0173 0.0173 Triad: 2804.8951 0.0177 0.0171 0.0215 ------------------------------------------------------------- Solution Validates ------------------------------------------------------------- hcl15.heterogeneous.ucd.ie ------------------------------------------------------------- STREAM version $Revision: 5.9 $ ------------------------------------------------------------- This system uses 8 bytes per DOUBLE PRECISION word. ------------------------------------------------------------- Array size = 2000000, Offset = 0 Total memory required = 45.8 MB. Each test is run 10 times, but only the *best* time for each is used. ------------------------------------------------------------- Number of Threads requested = 1 ------------------------------------------------------------- Printing one line per active thread.... ------------------------------------------------------------- Your clock granularity appears to be less than one microsecond. Each test below will take on the order of 9788 microseconds. (= 9788 clock ticks) Increase the size of the arrays if this shows that you are not getting at least 20 clock ticks per test. ------------------------------------------------------------- WARNING -- The above is only a rough guideline. For best results, please be sure you know the precision of your system timer. 
------------------------------------------------------------- Function Rate (MB/s) Avg time Min time Max time Copy: 2223.0043 0.0146 0.0144 0.0161 Scale: 2264.6703 0.0142 0.0141 0.0142 Add: 2740.5171 0.0176 0.0175 0.0178 Triad: 2762.2584 0.0174 0.0174 0.0174 ------------------------------------------------------------- Solution Validates ------------------------------------------------------------- hcl16.heterogeneous.ucd.ie ------------------------------------------------------------- STREAM version $Revision: 5.9 $ ------------------------------------------------------------- This system uses 8 bytes per DOUBLE PRECISION word. ------------------------------------------------------------- Array size = 2000000, Offset = 0 Total memory required = 45.8 MB. Each test is run 10 times, but only the *best* time for each is used. ------------------------------------------------------------- Number of Threads requested = 1 ------------------------------------------------------------- Printing one line per active thread.... ------------------------------------------------------------- Your clock granularity appears to be less than one microsecond. Each test below will take on the order of 9590 microseconds. (= 9590 clock ticks) Increase the size of the arrays if this shows that you are not getting at least 20 clock ticks per test. ------------------------------------------------------------- WARNING -- The above is only a rough guideline. For best results, please be sure you know the precision of your system timer. 
------------------------------------------------------------- Function Rate (MB/s) Avg time Min time Max time Copy: 2285.3882 0.0140 0.0140 0.0143 Scale: 2294.5687 0.0140 0.0139 0.0140 Add: 2800.9568 0.0172 0.0171 0.0172 Triad: 2823.2048 0.0170 0.0170 0.0171 ------------------------------------------------------------- Solution Validates ------------------------------------------------------------- d4052e377b595979a64c4f585915a49f2847dca8 File:Stream bench results.png 6 64 455 2010-07-29T13:29:00Z Rhiggins 4 Plots of STREAM benchmark for HCL Cluster wikitext text/x-wiki Plots of STREAM benchmark for HCL Cluster 2015d38b84c95d9ef84925a918c819c0d737bb90 Cluster Specification 0 65 460 2010-08-18T10:20:33Z Root 1 [[Cluster Specification]] moved to [[HCL Cluster Specifications]] wikitext text/x-wiki #REDIRECT [[HCL Cluster Specifications]] 2070b9135d3307d5c8ec4ad7a4a3f115fb483fa3 Old HCL Cluster Specifications 0 66 463 2010-08-18T10:30:34Z Root 1 New page: <TABLE FRAME=BOX CELLSPACING=0 COLS=12 RULES=GROUPS BORDER=1> <TR> <TD WIDTH=86 HEIGHT=16 ALIGN=LEFT>Rack Slot</TD> <TD WIDTH=116 ALIGN=LEFT>Name</TD> <TD WIDTH=160 ALIGN=LEFT>... 
wikitext text/x-wiki <TABLE FRAME=BOX CELLSPACING=0 COLS=12 RULES=GROUPS BORDER=1> <TR> <TD WIDTH=86 HEIGHT=16 ALIGN=LEFT>Rack Slot</TD> <TD WIDTH=116 ALIGN=LEFT>Name</TD> <TD WIDTH=160 ALIGN=LEFT>Make/Model</TD> <TD WIDTH=100 ALIGN=LEFT>O/S</TD> <TD WIDTH=100 ALIGN=LEFT>IP</TD> <TD WIDTH=111 ALIGN=LEFT>Processor</TD> <TD WIDTH=103 ALIGN=LEFT>Front Side Bus</TD> <TD WIDTH=86 ALIGN=LEFT>L2 Cache</TD> <TD WIDTH=86 ALIGN=LEFT>RAM</TD> <TD WIDTH=86 ALIGN=LEFT>HDD 1</TD> <TD WIDTH=86 ALIGN=LEFT>HDD 2</TD> <TD WIDTH=86 ALIGN=LEFT>NIC</TD> </TR> <TR> <TD HEIGHT=22 ALIGN=RIGHT SDVAL="42" SDNUM="1033;">42</TD> <TD ALIGN=LEFT>hclswitch1 </TD> <TD ALIGN=LEFT>Cisco Catalyst 3560G</TD> <TD ALIGN=LEFT>12.2(25)SEB2</TD> <TD ALIGN=LEFT>192.168.21.252</TD> <TD ALIGN=LEFT>N/A</TD> <TD ALIGN=LEFT>N/A</TD> <TD ALIGN=LEFT>N/A</TD> <TD ALIGN=LEFT>N/A</TD> <TD ALIGN=LEFT>N/A</TD> <TD ALIGN=LEFT>N/A</TD> <TD ALIGN=LEFT>24 x Gigabit</TD> </TR> <TR> <TD HEIGHT=22 ALIGN=RIGHT SDVAL="41" SDNUM="1033;">41</TD> <TD ALIGN=LEFT>hclswitch2</TD> <TD ALIGN=LEFT>Cisco Catalyst 3560G</TD> <TD ALIGN=LEFT>12.2(25)SEB2</TD> <TD ALIGN=LEFT>192.168.21.253</TD> <TD ALIGN=LEFT>N/A</TD> <TD ALIGN=LEFT>N/A</TD> <TD ALIGN=LEFT>N/A</TD> <TD ALIGN=LEFT>N/A</TD> <TD ALIGN=LEFT>N/A</TD> <TD ALIGN=LEFT>N/A</TD> <TD ALIGN=LEFT>24 x Gigabit</TD> </TR> <TR> <TD HEIGHT=16 ALIGN=RIGHT SDNUM="1033;0;M/D/YY">1 - 2</TD> <TD ALIGN=LEFT>N/A</TD> <TD ALIGN=LEFT>APC Smart UPS 1500</TD> <TD ALIGN=LEFT>N/A</TD> <TD ALIGN=LEFT>N/A</TD> <TD ALIGN=LEFT>N/A</TD> <TD ALIGN=LEFT>N/A</TD> <TD ALIGN=LEFT>N/A</TD> <TD ALIGN=LEFT>N/A</TD> <TD ALIGN=LEFT>N/A</TD> <TD ALIGN=LEFT>N/A</TD> <TD ALIGN=LEFT>N/A</TD> </TR> <TR> <TD HEIGHT=22 ALIGN=RIGHT SDVAL="3" SDNUM="1033;">3</TD> <TD ALIGN=LEFT>Hcl01 (NIC1)</TD> <TD ALIGN=LEFT>Dell Poweredge SC1425</TD> <TD ALIGN=LEFT>FC4</TD> <TD ALIGN=LEFT>192.168.21.3</TD> <TD ALIGN=LEFT>3.6 Xeon</TD> <TD ALIGN=LEFT>800MHz</TD> <TD ALIGN=LEFT>2MB</TD> <TD ALIGN=LEFT>256MB</TD> <TD ALIGN=LEFT>240GB SCSI</TD> 
<TD ALIGN=LEFT>80GB SCSI</TD> <TD ALIGN=LEFT>2 x Gigabit</TD> </TR> <TR> <TD HEIGHT=16 ALIGN=RIGHT SDVAL="3" SDNUM="1033;">3</TD> <TD ALIGN=LEFT>Hcl01 (NIC2)</TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT>192.168.21.103</TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> </TR> <TR> <TD HEIGHT=22 ALIGN=RIGHT SDVAL="4" SDNUM="1033;">4</TD> <TD ALIGN=LEFT>Hcl02 (NIC1)</TD> <TD ALIGN=LEFT>Dell Poweredge SC1425</TD> <TD ALIGN=LEFT>FC4</TD> <TD ALIGN=LEFT>192.168.21.4</TD> <TD ALIGN=LEFT>3.6 Xeon</TD> <TD ALIGN=LEFT>800MHz</TD> <TD ALIGN=LEFT>2MB</TD> <TD ALIGN=LEFT>256MB</TD> <TD ALIGN=LEFT>240GB SCSI</TD> <TD ALIGN=LEFT>80GB SCSI</TD> <TD ALIGN=LEFT>2 x Gigabit</TD> </TR> <TR> <TD HEIGHT=16 ALIGN=RIGHT SDVAL="4" SDNUM="1033;">4</TD> <TD ALIGN=LEFT>Hcl02 (NIC2)</TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT>192.168.21.104</TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> </TR> <TR> <TD HEIGHT=22 ALIGN=RIGHT SDVAL="5" SDNUM="1033;">5</TD> <TD ALIGN=LEFT>Hcl03 (NIC1)</TD> <TD ALIGN=LEFT>Dell Poweredge 750</TD> <TD ALIGN=LEFT>FC4</TD> <TD ALIGN=LEFT>192.168.21.5</TD> <TD ALIGN=LEFT>3.4 Xeon</TD> <TD ALIGN=LEFT>800MHz</TD> <TD ALIGN=LEFT>1MB</TD> <TD ALIGN=LEFT>1GB</TD> <TD ALIGN=LEFT>80GB SATA</TD> <TD ALIGN=LEFT>N/A</TD> <TD ALIGN=LEFT>2 x Gigabit</TD> </TR> <TR> <TD HEIGHT=16 ALIGN=RIGHT SDVAL="5" SDNUM="1033;">5</TD> <TD ALIGN=LEFT>Hcl03 (NIC2)</TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT>192.168.21.105</TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> </TR> <TR> <TD HEIGHT=22 ALIGN=RIGHT SDVAL="6" SDNUM="1033;">6</TD> 
<TD ALIGN=LEFT>Hcl04 (NIC1)</TD> <TD ALIGN=LEFT>Dell Poweredge 750</TD> <TD ALIGN=LEFT>FC4</TD> <TD ALIGN=LEFT>192.168.21.6</TD> <TD ALIGN=LEFT>3.4 Xeon</TD> <TD ALIGN=LEFT>800MHz</TD> <TD ALIGN=LEFT>1MB</TD> <TD ALIGN=LEFT>1GB</TD> <TD ALIGN=LEFT>80GB SATA</TD> <TD ALIGN=LEFT>N/A</TD> <TD ALIGN=LEFT>2 x Gigabit</TD> </TR> <TR> <TD HEIGHT=16 ALIGN=RIGHT SDVAL="6" SDNUM="1033;">6</TD> <TD ALIGN=LEFT>Hcl04 (NIC2)</TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT>192.168.21.106</TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> </TR> <TR> <TD HEIGHT=22 ALIGN=RIGHT SDVAL="7" SDNUM="1033;">7</TD> <TD ALIGN=LEFT>Hcl05 (NIC1)</TD> <TD ALIGN=LEFT>Dell Poweredge 750</TD> <TD ALIGN=LEFT>FC4</TD> <TD ALIGN=LEFT>192.168.21.7</TD> <TD ALIGN=LEFT>3.4 Xeon</TD> <TD ALIGN=LEFT>800MHz</TD> <TD ALIGN=LEFT>1MB</TD> <TD ALIGN=LEFT>1GB</TD> <TD ALIGN=LEFT>80GB SATA</TD> <TD ALIGN=LEFT>N/A</TD> <TD ALIGN=LEFT>2 x Gigabit</TD> </TR> <TR> <TD HEIGHT=16 ALIGN=RIGHT SDVAL="7" SDNUM="1033;">7</TD> <TD ALIGN=LEFT>Hcl05 (NIC2)</TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT>192.168.21.107</TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> </TR> <TR> <TD HEIGHT=22 ALIGN=RIGHT SDVAL="8" SDNUM="1033;">8</TD> <TD ALIGN=LEFT>Hcl06 (NIC1)</TD> <TD ALIGN=LEFT>Dell Poweredge 750</TD> <TD ALIGN=LEFT>FC4</TD> <TD ALIGN=LEFT>192.168.21.8</TD> <TD ALIGN=LEFT>3.4 Xeon</TD> <TD ALIGN=LEFT>800MHz</TD> <TD ALIGN=LEFT>1MB</TD> <TD ALIGN=LEFT>1GB</TD> <TD ALIGN=LEFT>80GB SATA</TD> <TD ALIGN=LEFT>N/A</TD> <TD ALIGN=LEFT>2 x Gigabit</TD> </TR> <TR> <TD HEIGHT=16 ALIGN=RIGHT SDVAL="8" SDNUM="1033;">8</TD> <TD ALIGN=LEFT>Hcl06 (NIC2)</TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD 
ALIGN=LEFT>192.168.21.108</TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> </TR> <TR> <TD HEIGHT=22 ALIGN=RIGHT SDVAL="9" SDNUM="1033;">9</TD> <TD ALIGN=LEFT>Hcl07 (NIC1)</TD> <TD ALIGN=LEFT>Dell Poweredge 750</TD> <TD ALIGN=LEFT>FC4</TD> <TD ALIGN=LEFT>192.168.21.9</TD> <TD ALIGN=LEFT>3.4 Xeon</TD> <TD ALIGN=LEFT>800MHz</TD> <TD ALIGN=LEFT>1MB</TD> <TD ALIGN=LEFT>256MB</TD> <TD ALIGN=LEFT>80GB SATA</TD> <TD ALIGN=LEFT>N/A</TD> <TD ALIGN=LEFT>2 x Gigabit</TD> </TR> <TR> <TD HEIGHT=16 ALIGN=RIGHT SDVAL="9" SDNUM="1033;">9</TD> <TD ALIGN=LEFT>Hcl07 (NIC2)</TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT>192.168.21.109</TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> </TR> <TR> <TD HEIGHT=22 ALIGN=RIGHT SDVAL="10" SDNUM="1033;">10</TD> <TD ALIGN=LEFT>Hcl08 (NIC1)</TD> <TD ALIGN=LEFT>Dell Poweredge 750</TD> <TD ALIGN=LEFT>FC4</TD> <TD ALIGN=LEFT>192.168.21.10</TD> <TD ALIGN=LEFT>3.4 Xeon</TD> <TD ALIGN=LEFT>800MHz</TD> <TD ALIGN=LEFT>1MB</TD> <TD ALIGN=LEFT>256MB</TD> <TD ALIGN=LEFT>80GB SATA</TD> <TD ALIGN=LEFT>N/A</TD> <TD ALIGN=LEFT>2 x Gigabit</TD> </TR> <TR> <TD HEIGHT=16 ALIGN=RIGHT SDVAL="10" SDNUM="1033;">10</TD> <TD ALIGN=LEFT>Hcl08 (NIC2)</TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT>192.168.21.110</TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> </TR> <TR> <TD HEIGHT=22 ALIGN=RIGHT SDVAL="11" SDNUM="1033;">11</TD> <TD ALIGN=LEFT>Hcl09 (NIC1)</TD> <TD ALIGN=LEFT>IBM E-server 326</TD> <TD ALIGN=LEFT>Debian</TD> <TD ALIGN=LEFT>192.168.21.11</TD> <TD ALIGN=LEFT>1.8 AMD Opteron</TD> <TD ALIGN=LEFT>1GHz</TD> <TD ALIGN=LEFT>1MB</TD> 
<TD ALIGN=LEFT>1GB</TD> <TD ALIGN=LEFT>80GB SATA</TD> <TD ALIGN=LEFT>N/A</TD> <TD ALIGN=LEFT>2 x Gigabit</TD> </TR> <TR> <TD HEIGHT=16 ALIGN=RIGHT SDVAL="11" SDNUM="1033;">11</TD> <TD ALIGN=LEFT>Hcl09 (NIC2)</TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT>192.168.21.111</TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> </TR> <TR> <TD HEIGHT=22 ALIGN=RIGHT SDVAL="12" SDNUM="1033;">12</TD> <TD ALIGN=LEFT>Hcl10 (NIC1)</TD> <TD ALIGN=LEFT>IBM E-server 326</TD> <TD ALIGN=LEFT>FC4</TD> <TD ALIGN=LEFT>192.168.21.12</TD> <TD ALIGN=LEFT>1.8 AMD Opteron</TD> <TD ALIGN=LEFT>1GHz</TD> <TD ALIGN=LEFT>1MB</TD> <TD ALIGN=LEFT>1GB</TD> <TD ALIGN=LEFT>80GB SATA</TD> <TD ALIGN=LEFT>N/A</TD> <TD ALIGN=LEFT>2 x Gigabit</TD> </TR> <TR> <TD HEIGHT=16 ALIGN=RIGHT SDVAL="12" SDNUM="1033;">12</TD> <TD ALIGN=LEFT>Hcl10 (NIC2)</TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT>192.168.21.112</TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> </TR> <TR> <TD HEIGHT=22 ALIGN=RIGHT SDVAL="13" SDNUM="1033;">13</TD> <TD ALIGN=LEFT>Hcl11 (NIC1)</TD> <TD ALIGN=LEFT>IBM X-Series 306</TD> <TD ALIGN=LEFT>Debian</TD> <TD ALIGN=LEFT>192.168.21.13</TD> <TD ALIGN=LEFT>3.2 P4</TD> <TD ALIGN=LEFT>800MHz</TD> <TD ALIGN=LEFT>1MB</TD> <TD ALIGN=LEFT>512MB</TD> <TD ALIGN=LEFT>80GB SATA</TD> <TD ALIGN=LEFT>N/A</TD> <TD ALIGN=LEFT>2 x Gigabit</TD> </TR> <TR> <TD HEIGHT=16 ALIGN=RIGHT SDVAL="13" SDNUM="1033;">13</TD> <TD ALIGN=LEFT>Hcl11 (NIC2)</TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT>192.168.21.113</TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> </TR> <TR> <TD 
HEIGHT=22 ALIGN=RIGHT SDVAL="14" SDNUM="1033;">14</TD> <TD ALIGN=LEFT>Hcl12 (NIC1)</TD> <TD ALIGN=LEFT>HP Proliant DL 320 G3</TD> <TD ALIGN=LEFT>FC4</TD> <TD ALIGN=LEFT>192.168.21.14</TD> <TD ALIGN=LEFT>3.4 P4</TD> <TD ALIGN=LEFT>800MHz</TD> <TD ALIGN=LEFT>1MB</TD> <TD ALIGN=LEFT>512MB</TD> <TD ALIGN=LEFT>80GB SATA</TD> <TD ALIGN=LEFT>N/A</TD> <TD ALIGN=LEFT>2 x Gigabit</TD> </TR> <TR> <TD HEIGHT=16 ALIGN=RIGHT SDVAL="14" SDNUM="1033;">14</TD> <TD ALIGN=LEFT>Hcl12 (NIC2)</TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT>192.168.21.114</TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> </TR> <TR> <TD HEIGHT=22 ALIGN=RIGHT SDVAL="15" SDNUM="1033;">15</TD> <TD ALIGN=LEFT>Hcl13 (NIC1)</TD> <TD ALIGN=LEFT>HP Proliant DL 320 G3</TD> <TD ALIGN=LEFT>FC4</TD> <TD ALIGN=LEFT>192.168.21.15</TD> <TD ALIGN=LEFT>2.9 Celeron</TD> <TD ALIGN=LEFT>533MHz</TD> <TD ALIGN=LEFT>256KB</TD> <TD ALIGN=LEFT>1GB</TD> <TD ALIGN=LEFT>80GB SATA</TD> <TD ALIGN=LEFT>N/A</TD> <TD ALIGN=LEFT>2 x Gigabit</TD> </TR> <TR> <TD HEIGHT=16 ALIGN=RIGHT SDVAL="15" SDNUM="1033;">15</TD> <TD ALIGN=LEFT>Hcl13 (NIC2)</TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT>192.168.21.115</TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> </TR> <TR> <TD HEIGHT=22 ALIGN=RIGHT SDVAL="16" SDNUM="1033;">16</TD> <TD ALIGN=LEFT>Hcl14 (NIC1)</TD> <TD ALIGN=LEFT>HP Proliant DL 140 G2</TD> <TD ALIGN=LEFT>Debian</TD> <TD ALIGN=LEFT>192.168.21.16</TD> <TD ALIGN=LEFT>3.4 Xeon</TD> <TD ALIGN=LEFT>800MHz</TD> <TD ALIGN=LEFT>1MB</TD> <TD ALIGN=LEFT>1GB</TD> <TD ALIGN=LEFT>80GB SATA</TD> <TD ALIGN=LEFT>N/A</TD> <TD ALIGN=LEFT>2 x Gigabit</TD> </TR> <TR> <TD HEIGHT=16 ALIGN=RIGHT SDVAL="16" SDNUM="1033;">16</TD> <TD ALIGN=LEFT>Hcl14 
(NIC2)</TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT>192.168.21.116</TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> </TR> <TR> <TD HEIGHT=22 ALIGN=RIGHT SDVAL="17" SDNUM="1033;">17</TD> <TD ALIGN=LEFT>Hcl15 (NIC1)</TD> <TD ALIGN=LEFT>HP Proliant DL 140 G2</TD> <TD ALIGN=LEFT>Debian</TD> <TD ALIGN=LEFT>192.168.21.17</TD> <TD ALIGN=LEFT>2.8 Xeon</TD> <TD ALIGN=LEFT>800MHz</TD> <TD ALIGN=LEFT>1MB</TD> <TD ALIGN=LEFT>1GB</TD> <TD ALIGN=LEFT>80GB SATA</TD> <TD ALIGN=LEFT>N/A</TD> <TD ALIGN=LEFT>2 x Gigabit</TD> </TR> <TR> <TD HEIGHT=16 ALIGN=RIGHT SDVAL="17" SDNUM="1033;">17</TD> <TD ALIGN=LEFT>Hcl15 (NIC2)</TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT>192.168.21.117</TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> </TR> <TR> <TD HEIGHT=22 ALIGN=RIGHT SDVAL="18" SDNUM="1033;">18</TD> <TD ALIGN=LEFT>Hcl16 (NIC1)</TD> <TD ALIGN=LEFT>HP Proliant DL 140 G2</TD> <TD ALIGN=LEFT>Debian</TD> <TD ALIGN=LEFT>192.168.21.18</TD> <TD ALIGN=LEFT>3.6 Xeon</TD> <TD ALIGN=LEFT>800MHz</TD> <TD ALIGN=LEFT>2MB</TD> <TD ALIGN=LEFT>1GB</TD> <TD ALIGN=LEFT>80GB SATA</TD> <TD ALIGN=LEFT>N/A</TD> <TD ALIGN=LEFT>2 x Gigabit</TD> </TR> <TR> <TD HEIGHT=16 ALIGN=RIGHT SDVAL="18" SDNUM="1033;">18</TD> <TD ALIGN=LEFT>Hcl16 (NIC2)</TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT>192.168.21.118</TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> </TR> </TABLE> 85bd020495eb9fc3c34aa41a823126c6d63c8459 File:Cluster.jpg 6 67 464 2010-08-18T10:32:42Z Root 1 HCL cluster wikitext text/x-wiki HCL cluster 6b114c65ecdac5780d0c3c58d01a42ea3fd33835 HCL Cluster 
Network 0 68 468 2010-08-18T10:44:27Z Root 1 New page: <p align="center"><font size="5"><strong>Switch Management</strong></font></p> <p><font size="4">The cluster is connected via two Cisco Catalyst 3560G switches. The ingress bandwidth on... wikitext text/x-wiki <p align="center"><font size="5"><strong>Switch Management</strong></font></p> <p><font size="4">The cluster is connected via two Cisco Catalyst 3560G switches. The ingress bandwidth on any physical port can be configured to any value between 8 Kb/s and 1 Gb/s. The switches are connected to each other via a Gigabit SFP cable.</font></p> <p align="left"><font size="4">As the <a href="Specs.html">Cluster Specifications</a> show, each node has two network interfaces, each with its own IP address. Each eth0 is connected to switch 1, and each eth1 is connected to switch 2. The topology you wish to use determines which IP address you should use when referring to each machine. </font></p> <p>&nbsp;</p> <p align="center"><font size="5"><strong>Switch Access</strong></font></p> <p><font size="4">The switches can be accessed through telnet from any machine on the cluster network. Type<br> telnet 192.168.21.252 for switch1 or telnet 192.168.21.253 for switch2, and the switch should prompt for a password. 
For the password, email Brett Becker.</font></p> <p>&nbsp;</p> <p align="center"><font size="5"><strong>Switch Configuration</strong></font></p> <p><font size="4"><br> As an example, we demonstrate how to limit the ingress bandwidth on the ports connected to hcl03 and hcl04 to 100 Mbps:<br> -----------------------------------<br> hcl02 $&gt; telnet 192.168.21.252<br> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Trying 192.168.21.252...<br> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Connected to 192.168.21.252 (192.168.21.252).<br> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Escape character is '^]'.</font></p> <p><font size="4">&nbsp;&nbsp;&nbsp;&nbsp;User Access Verification</font></p> <p><font size="4">Password: &lt;enter password&gt;<br> hclswitch1&gt;enable<br> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Password: &lt;enter password&gt;<br> hclswitch1#configure terminal<br> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Enter configuration commands, one per line. End with CNTL/Z.<br> hclswitch1(config)#policy-map example<br> hclswitch1(config-pmap)#class ipclass1<br> hclswitch1(config-pmap-c)#police 100000000 800000 exceed-action drop<br> hclswitch1(config-pmap-c)#exit<br> hclswitch1(config-pmap)#exit<br> hclswitch1(config)#interface gigabitEthernet0/3<br> hclswitch1(config-if)#service-policy input example<br> hclswitch1(config-if)#exit<br> hclswitch1(config)#interface gigabitEthernet0/4<br> hclswitch1(config-if)#service-policy input example<br> hclswitch1(config-if)#exit<br> hclswitch1(config)#exit<br> hclswitch1#show policy-map<br> &nbsp;&nbsp;&nbsp;Policy Map example<br> &nbsp;&nbsp;&nbsp;&nbsp;Class ipclass1<br> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;police 100000000 800000 exceed-action drop<br> hclswitch1#exit<br> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Connection closed by foreign host.</font></p> <p><font size="4">hcl02 $&gt;<br> ----------------------------------------</font></p> <p><font size="4">In the above example, we telnet to the switch and enter the password. We then type &#8220;enable&#8221; and enter the password again. 
Then we type &#8220;configure terminal&#8221; to enter configuration mode. We then create a new policy-map named &#8220;example&#8221; and bind it to class &#8220;ipclass1&#8221;. ipclass1 is a class that incorporates all ports on the switch. PLEASE DO NOT REMOVE THIS CLASS! We then enter the key statement police 100000000 800000 exceed-action drop. This tells the switch that this policy-map limits the bandwidth to 100Mbps (100000000 bits/s) with a burst size of 800000 bytes, and that packets exceeding the rate are to be dropped. We then exit policy-map configuration and enter interface gigabitEthernet0/3, the port connected to hcl03, and attach the policy-map &#8220;example&#8221; to it. We do the same for port gigabitEthernet0/4, which is connected to hcl04. Finally, the policy-map is viewed with the show command to ensure that it is correct. </font></p> <p><font size="4">This example demonstrates an important fact about the cluster. hcl01&#8217;s eth0 (192.168.21.3) is connected to port gigabitEthernet0/1 on switch1. Similarly, hclx&#8217;s eth0 (192.168.21.x+2) is connected to gigabitEthernet0/x on switch1.<br> Further, hcl01&#8217;s eth1 (192.168.21.103) is connected to gigabitEthernet0/1 on switch2. Similarly, hclx&#8217;s eth1 (192.168.21.100+x+2) is connected to gigabitEthernet0/x on switch2. <br> The table below shows this comprehensively.
</font></p> <div align="center"> <table width="53%" border="1" cellpadding="0"> <tr> <td width="20%"><strong>Machine</strong></td> <td width="23%"><strong>IP Address</strong></td> <td width="26%"><strong>Switch1 Port</strong></td> <td width="31%"><strong>Switch2 Port</strong></td> </tr> <tr> <td>hcl01</td> <td>192.168.21.3</td> <td>gigabitEthernet0/1</td> <td>N/A</td> </tr> <tr> <td>hcl01_eth1</td> <td>192.168.21.103</td> <td>N/A</td> <td>gigabitEthernet0/1</td> </tr> <tr> <td>hcl02</td> <td>192.168.21.4</td> <td>gigabitEthernet0/2</td> <td>N/A</td> </tr> <tr> <td>hcl02_eth1</td> <td>192.168.21.104</td> <td>N/A</td> <td>gigabitEthernet0/2</td> </tr> <tr> <td>hcl03</td> <td>192.168.21.5</td> <td>gigabitEthernet0/3</td> <td>N/A</td> </tr> <tr> <td>hcl03_eth1</td> <td><div align="left">192.168.21.105</div></td> <td>N/A</td> <td>gigabitEthernet0/3</td> </tr> <tr> <td>hcl04</td> <td>192.168.21.6</td> <td>gigabitEthernet0/4</td> <td>N/A</td> </tr> <tr> <td>hcl04_eth1</td> <td>192.168.21.106</td> <td>N/A</td> <td>gigabitEthernet0/4</td> </tr> <tr> <td>hcl05</td> <td>192.168.21.7</td> <td>gigabitEthernet0/5</td> <td>N/A</td> </tr> <tr> <td>hcl05_eth1</td> <td>192.168.21.107</td> <td>N/A</td> <td>gigabitEthernet0/5</td> </tr> <tr> <td>hcl06</td> <td>192.168.21.8</td> <td>gigabitEthernet0/6</td> <td>N/A</td> </tr> <tr> <td>hcl06_eth1</td> <td>192.168.21.108</td> <td>N/A</td> <td>gigabitEthernet0/6</td> </tr> <tr> <td>hcl07</td> <td>192.168.21.9</td> <td>gigabitEthernet0/7</td> <td>N/A</td> </tr> <tr> <td>hcl07_eth1</td> <td>192.168.21.109</td> <td>N/A</td> <td>gigabitEthernet0/7</td> </tr> <tr> <td>hcl08</td> <td>192.168.21.10</td> <td>gigabitEthernet0/8</td> <td>N/A</td> </tr> <tr> <td>hcl08_eth1</td> <td>192.168.21.110</td> <td>N/A</td> <td>gigabitEthernet0/8</td> </tr> <tr> <td>hcl09</td> <td>192.168.21.11</td> <td>gigabitEthernet0/9</td> <td>N/A</td> </tr> <tr> <td>hcl09_eth1</td> <td>192.168.21.111</td> <td>N/A</td> <td>gigabitEthernet0/9</td> </tr> <tr> <td>hcl10</td> 
<td>192.168.21.12</td> <td>gigabitEthernet0/10</td> <td>N/A</td> </tr> <tr> <td>hcl10_eth1</td> <td>192.168.21.112</td> <td>N/A</td> <td>gigabitEthernet0/10</td> </tr> <tr> <td>hcl11</td> <td>192.168.21.13</td> <td>gigabitEthernet0/11</td> <td>N/A</td> </tr> <tr> <td>hcl11_eth1</td> <td>192.168.21.113</td> <td>N/A</td> <td>gigabitEthernet0/11</td> </tr> <tr> <td>hcl12</td> <td>192.168.21.14</td> <td>gigabitEthernet0/12</td> <td>N/A</td> </tr> <tr> <td>hcl12_eth1</td> <td>192.168.21.114</td> <td>N/A</td> <td>gigabitEthernet0/12</td> </tr> <tr> <td>hcl13</td> <td>192.168.21.15</td> <td>gigabitEthernet0/13</td> <td>N/A</td> </tr> <tr> <td>hcl13_eth1</td> <td>192.168.21.115</td> <td>N/A</td> <td>gigabitEthernet0/13</td> </tr> <tr> <td>hcl14</td> <td>192.168.21.16</td> <td>gigabitEthernet0/14</td> <td>N/A</td> </tr> <tr> <td>hcl14_eth1</td> <td>192.168.21.116</td> <td>N/A</td> <td>gigabitEthernet0/14</td> </tr> <tr> <td>hcl15</td> <td>192.168.21.17</td> <td>gigabitEthernet0/15</td> <td>N/A</td> </tr> <tr> <td>hcl15_eth1</td> <td>192.168.21.117</td> <td>N/A</td> <td>gigabitEthernet0/15</td> </tr> <tr> <td>hcl16</td> <td>192.168.21.18</td> <td>gigabitEthernet0/16</td> <td>N/A</td> </tr> <tr> <td>hcl16_eth1</td> <td>192.168.21.118</td> <td>N/A</td> <td>gigabitEthernet0/16</td> </tr> </table> </div> <p></p> <p> </p> <p><font size="4">Limiting the bandwidth between the two switches is similar to the above example, but on switch1, Interface gigabitEthernet0/25 should be assigned the desired policy-map. Then on switch2, the same should be done. </font></p> <p>&nbsp;</p> <p align="center"><font size="5"><strong>Simulating Two Clusters</strong></font></p> <p><font size="4">If you want to simulate two clusters with the cluster, the following example should be considered.</font></p> <p><font size="4">Say we want Cluster A to be comprised of hcl01 and hcl02, and Cluster B, to comprised of hcl03 and hcl04. 
We want hcl01 and hcl02 to &#8220;talk&#8221; to each other at 10Mbps, and hcl03 and hcl04 to &#8220;talk&#8221; at 1Gbps. Additionally, we want to restrict the link between Clusters A and B to a bandwidth of 250Mbps. <br> To accomplish this we perform the following steps:</font></p> <p><font size="4">1.) Log onto hcl01 and make sure that eth0 is active and eth1 is not. This means that the machine is connected to switch1.</font></p> <p><font size="4">2.) Do the same for hcl02. </font></p> <p><font size="4">3.) Log onto switch1, and create a policy-map to limit the bandwidth to 10Mbps (10000000bps). Attach this policy-map to gigabitEthernet0/1 and gigabitEthernet0/2. </font></p> <p><font size="4">4.) Log onto hcl03 and perform the following steps:</font></p> <p><font size="4"> hcl03 $&gt; /sbin/ifconfig <br> This will list the active network devices. If the user before you cleaned up after themselves, only &#8220;lo&#8221; (the loopback interface) and &#8220;eth0&#8221; should be listed. <br> hcl03 $&gt; /sbin/ifup eth1 (on Debian: sudo /sbin/ifup eth1)<br> hcl03 $&gt; /sbin/ifdown eth0 (on Debian: sudo /sbin/ifdown eth0)</font></p> <p><font size="4">This connects hcl03 to switch2, as eth1 is connected to switch2. Then hcl03 is disconnected from switch1, as eth0 is connected to switch1. MAKE SURE YOU ALWAYS BRING ONE INTERFACE UP BEFORE YOU BRING ANOTHER DOWN. OTHERWISE THE MACHINE WILL BE ISOLATED WITH NO ACTIVE NETWORK DEVICES. To see what devices are currently active, use the /sbin/ifconfig command. </font></p> <p><font size="4">5.) The same should be done for hcl04. </font></p> <p><font size="4">6.) Since we want these machines to talk at 1Gbps, we should log onto switch2 and make sure that no policy-maps exist. </font></p> <p><font size="4">7.) Log onto switch1 and create a new policy-map, limit it to 250000000bps and attach it to gigabitEthernet0/25.
Do the same for switch2.</font></p> <p><font size="4">Done!</font></p> <p> </p> <p><font size="4">The following example shows how to delete a policy-map. YOU SHOULD ALWAYS REMEMBER TO DELETE YOUR POLICY-MAPS WHEN YOUR JOBS ARE DONE SO OTHER USER&#8217;S JOBS DON&#8217;T GET MESSED UP! </font></p> <p><font size="4">--------------------------------------------------------------<br> hcl0x $&gt; telnet 192.168.21.252<br> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Trying 192.168.21.252...<br> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Connected to 192.168.21.252 (192.168.21.252).<br> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Escape character is '^]'.</font></p> <p><font size="4">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;User Access Verification<br> Password: &lt;enter password&gt;<br> hclswitch1&gt;enable<br> Password:<br> hclswitch1#show policy-map<br> &nbsp;&nbsp;&nbsp;Policy Map example<br> &nbsp;&nbsp;&nbsp;&nbsp;Class ipclass1<br> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;police 100000000 800000 exceed-action drop<br> hclswitch1#config t<br> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Enter configuration commands, one per line. End with CNTL/Z.<br> hclswitch1(config)#no policy-map example<br> hclswitch1(config)#exit<br> hclswitch1#show policy-map<br> <br> hclswitch1#exit<br> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Connection closed by foreign host.</font></p> <p><font size="4">hcl0x $&gt;<br> ---------------------------------------------------------------</font></p> 36c402fb026eaa2b1ff91f349e7bd235279efd40 Main Page 0 1 469 404 2010-08-18T11:16:00Z Root 1 /* Mathematics */ wikitext text/x-wiki This site is set up for sharing ideas, findings and experience in heterogeneous computing. Please, log in and create new or edit existing pages. How to format wiki-pages read [[Help:Editing|here]]. 
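The transcripts above are formulaic: node hclx sits on port gigabitEthernet0/x of its switch, and a policy-map is created once and then attached to each interface. A small helper can generate that command sequence; this is only a sketch, the function name is ours, and the policy name &#8220;example&#8221; and the 800000-byte burst simply mirror the transcripts:

```python
def limit_commands(nodes, rate_bps, policy="example", burst=800000):
    """Generate the IOS command sequence (as typed in the transcripts
    above) to rate-limit the switch ports of the given hcl nodes.

    Node hclx is attached to gigabitEthernet0/x on its switch."""
    cmds = [
        "configure terminal",
        f"policy-map {policy}",
        "class ipclass1",  # the class covering all ports; do not remove it
        f"police {rate_bps} {burst} exceed-action drop",
        "exit",
        "exit",
    ]
    for x in nodes:
        cmds += [
            f"interface gigabitEthernet0/{x}",
            f"service-policy input {policy}",
            "exit",
        ]
    return cmds

# 100 Mbps on the ports of hcl03 and hcl04, as in the example above:
for c in limit_commands([3, 4], 100_000_000):
    print(c)
```

The commands would still be pasted into the telnet session by hand (or driven by an Expect script); the helper only removes the chance of mistyping a rate or port number.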
== Operating systems == * [[Linux]] * [[Windows]] == Development tools == * [[C/C++]], [[Python]] * [[Autotools]] * [[GDB]], [[OProfile]], [[Valgrind]] * [[Doxygen]] * [[ChangeLog]], [[Subversion]] * [[Eclipse]] == [[Libraries]] == * [[GNU C Library]] * [[MPI]] * [[STL]], [[Boost]] * [[GSL]] * [[BLAS/LAPACK]] * [[The MPI LogP Benchmark|logp_mpi]] == Data processing == * [[gnuplot]] * [[Graphviz]] * [[Octave]], [[R]] == Presentation == * [[Dia]] * [[LaTeX]] == Clusters == * [[HCL cluster]] * [[Other UCD Resources]] * [[UTK clusters]] * [[Grid5000]] [[SSH|How to connect to cluster via SSH]] == Mathematics == * [http://en.wikipedia.org/wiki/Confidence_interval Confidence interval (Statistics)], [http://en.wikipedia.org/wiki/Student's_t-distribution Student's t-distribution] (implemented in [[GSL]]) * [http://en.wikipedia.org/wiki/Linear_regression Linear regression] (implemented in [[GSL]]) * [http://en.wikipedia.org/wiki/Binomial_tree#Binomial_tree Binomial tree] (use [[Graphviz]] to visualize trees) * [http://en.wikipedia.org/wiki/Spline_interpolation Spline interpolation], [http://en.wikipedia.org/wiki/B-spline Spline approximation] (implemented in [[GSL]]) == Tips & Tricks == * [http://www.gnu.org/s/libc/manual/html_node/Date-and-Time.html#Date-and-Time Timing in C] c22fa93bd0fd1c600d8d949291b2f6621e9cc51e 473 469 2010-08-31T16:53:49Z Root 1 /* Libraries */ wikitext text/x-wiki This site is set up for sharing ideas, findings and experience in heterogeneous computing. Please, log in and create new or edit existing pages. How to format wiki-pages read [[Help:Editing|here]]. 
== Operating systems == * [[Linux]] * [[Windows]] == Development tools == * [[C/C++]], [[Python]] * [[Autotools]] * [[GDB]], [[OProfile]], [[Valgrind]] * [[Doxygen]] * [[ChangeLog]], [[Subversion]] * [[Eclipse]] == [[Libraries]] == * [[GNU C Library]] * [[MPI]] * [[STL]], [[Boost]] * [[GSL]] * [[BLAS/LAPACK]] == Data processing == * [[gnuplot]] * [[Graphviz]] * [[Octave]], [[R]] == Presentation == * [[Dia]] * [[LaTeX]] == Clusters == * [[HCL cluster]] * [[Other UCD Resources]] * [[UTK clusters]] * [[Grid5000]] [[SSH|How to connect to cluster via SSH]] == Mathematics == * [http://en.wikipedia.org/wiki/Confidence_interval Confidence interval (Statistics)], [http://en.wikipedia.org/wiki/Student's_t-distribution Student's t-distribution] (implemented in [[GSL]]) * [http://en.wikipedia.org/wiki/Linear_regression Linear regression] (implemented in [[GSL]]) * [http://en.wikipedia.org/wiki/Binomial_tree#Binomial_tree Binomial tree] (use [[Graphviz]] to visualize trees) * [http://en.wikipedia.org/wiki/Spline_interpolation Spline interpolation], [http://en.wikipedia.org/wiki/B-spline Spline approximation] (implemented in [[GSL]]) == Tips & Tricks == * [http://www.gnu.org/s/libc/manual/html_node/Date-and-Time.html#Date-and-Time Timing in C] 93c4a3c08016b040b2cb92db323671db91d06f8e 474 473 2010-08-31T17:00:47Z Root 1 wikitext text/x-wiki This site is set up for sharing ideas, findings and experience in heterogeneous computing. Please, log in and create new or edit existing pages. How to format wiki-pages read [[Help:Editing|here]]. 
== HCL software for heterogeneous computing == * [http://hcl.ucd.ie/project/mpC mpC] * [http://hcl.ucd.ie/project/HeteroMPI HeteroMPI] * [http://hcl.ucd.ie/project/libELC libELC] * [http://hcl.ucd.ie/project/NI-Connect NI-Connect] * [http://hcl.ucd.ie/project/SmartGridSolve SmartGridSolve] * [http://hcl.ucd.ie/project/fupermod FuPerMod] * [http://hcl.ucd.ie/project/pmm PMM] * [http://hcl.ucd.ie/project/cpm CPM] * [http://hcl.ucd.ie/project/mpiblib MPIBlib] == Heterogeneous mathematical software == * [http://hcl.ucd.ie/project/HeteroScaLAPACK HeteroScaLAPACK] * [http://hcl.ucd.ie/project/Hydropad Hydropad] == Operating systems == * [[Linux]] * [[Windows]] == Development tools == * [[C/C++]], [[Python]] * [[Autotools]] * [[GDB]], [[OProfile]], [[Valgrind]] * [[Doxygen]] * [[ChangeLog]], [[Subversion]] * [[Eclipse]] == [[Libraries]] == * [[GNU C Library]] * [[MPI]] * [[STL]], [[Boost]] * [[GSL]] * [[BLAS/LAPACK]] == Data processing == * [[gnuplot]] * [[Graphviz]] * [[Octave]], [[R]] == Presentation == * [[Dia]] * [[LaTeX]] == Clusters == * [[HCL cluster]] * [[Other UCD Resources]] * [[UTK clusters]] * [[Grid5000]] [[SSH|How to connect to cluster via SSH]] == Mathematics == * [http://en.wikipedia.org/wiki/Confidence_interval Confidence interval (Statistics)], [http://en.wikipedia.org/wiki/Student's_t-distribution Student's t-distribution] (implemented in [[GSL]]) * [http://en.wikipedia.org/wiki/Linear_regression Linear regression] (implemented in [[GSL]]) * [http://en.wikipedia.org/wiki/Binomial_tree#Binomial_tree Binomial tree] (use [[Graphviz]] to visualize trees) * [http://en.wikipedia.org/wiki/Spline_interpolation Spline interpolation], [http://en.wikipedia.org/wiki/B-spline Spline approximation] (implemented in [[GSL]]) == Tips & Tricks == * [http://www.gnu.org/s/libc/manual/html_node/Date-and-Time.html#Date-and-Time Timing in C] bf0bac7c1784536dbfc7fbd37f13cfc29c3b86aa 475 474 2010-08-31T17:12:34Z Root 1 /* HCL software for heterogeneous computing */ wikitext 
text/x-wiki This site is set up for sharing ideas, findings and experience in heterogeneous computing. Please, log in and create new or edit existing pages. How to format wiki-pages read [[Help:Editing|here]]. == HCL software for heterogeneous computing == * [http://hcl.ucd.ie/project/mpC mpC] [http://hcl.ucd.ie/project/HeteroMPI HeteroMPI] * [http://hcl.ucd.ie/project/libELC libELC] * [http://hcl.ucd.ie/project/SmartGridSolve SmartGridSolve] [http://hcl.ucd.ie/project/NI-Connect NI-Connect] * [http://hcl.ucd.ie/project/fupermod FuPerMod] [http://hcl.ucd.ie/project/pmm PMM] * [http://hcl.ucd.ie/project/cpm CPM] [http://hcl.ucd.ie/project/mpiblib MPIBlib] == Heterogeneous mathematical software == * [http://hcl.ucd.ie/project/HeteroScaLAPACK HeteroScaLAPACK] * [http://hcl.ucd.ie/project/Hydropad Hydropad] == Operating systems == * [[Linux]] * [[Windows]] == Development tools == * [[C/C++]], [[Python]] * [[Autotools]] * [[GDB]], [[OProfile]], [[Valgrind]] * [[Doxygen]] * [[ChangeLog]], [[Subversion]] * [[Eclipse]] == [[Libraries]] == * [[GNU C Library]] * [[MPI]] * [[STL]], [[Boost]] * [[GSL]] * [[BLAS/LAPACK]] == Data processing == * [[gnuplot]] * [[Graphviz]] * [[Octave]], [[R]] == Presentation == * [[Dia]] * [[LaTeX]] == Clusters == * [[HCL cluster]] * [[Other UCD Resources]] * [[UTK clusters]] * [[Grid5000]] [[SSH|How to connect to cluster via SSH]] == Mathematics == * [http://en.wikipedia.org/wiki/Confidence_interval Confidence interval (Statistics)], [http://en.wikipedia.org/wiki/Student's_t-distribution Student's t-distribution] (implemented in [[GSL]]) * [http://en.wikipedia.org/wiki/Linear_regression Linear regression] (implemented in [[GSL]]) * [http://en.wikipedia.org/wiki/Binomial_tree#Binomial_tree Binomial tree] (use [[Graphviz]] to visualize trees) * [http://en.wikipedia.org/wiki/Spline_interpolation Spline interpolation], [http://en.wikipedia.org/wiki/B-spline Spline approximation] (implemented in [[GSL]]) == Tips & Tricks == * 
[http://www.gnu.org/s/libc/manual/html_node/Date-and-Time.html#Date-and-Time Timing in C] 8535390b259c67648c05b06e6a3fc2ddcf8dee84 476 475 2010-08-31T17:12:54Z Root 1 /* HCL software for heterogeneous computing */ wikitext text/x-wiki This site is set up for sharing ideas, findings and experience in heterogeneous computing. Please, log in and create new or edit existing pages. How to format wiki-pages read [[Help:Editing|here]]. == HCL software for heterogeneous computing == * [http://hcl.ucd.ie/project/mpC mpC] [http://hcl.ucd.ie/project/HeteroMPI HeteroMPI] [http://hcl.ucd.ie/project/libELC libELC] * [http://hcl.ucd.ie/project/SmartGridSolve SmartGridSolve] [http://hcl.ucd.ie/project/NI-Connect NI-Connect] * [http://hcl.ucd.ie/project/fupermod FuPerMod] [http://hcl.ucd.ie/project/pmm PMM] * [http://hcl.ucd.ie/project/cpm CPM] [http://hcl.ucd.ie/project/mpiblib MPIBlib] == Heterogeneous mathematical software == * [http://hcl.ucd.ie/project/HeteroScaLAPACK HeteroScaLAPACK] * [http://hcl.ucd.ie/project/Hydropad Hydropad] == Operating systems == * [[Linux]] * [[Windows]] == Development tools == * [[C/C++]], [[Python]] * [[Autotools]] * [[GDB]], [[OProfile]], [[Valgrind]] * [[Doxygen]] * [[ChangeLog]], [[Subversion]] * [[Eclipse]] == [[Libraries]] == * [[GNU C Library]] * [[MPI]] * [[STL]], [[Boost]] * [[GSL]] * [[BLAS/LAPACK]] == Data processing == * [[gnuplot]] * [[Graphviz]] * [[Octave]], [[R]] == Presentation == * [[Dia]] * [[LaTeX]] == Clusters == * [[HCL cluster]] * [[Other UCD Resources]] * [[UTK clusters]] * [[Grid5000]] [[SSH|How to connect to cluster via SSH]] == Mathematics == * [http://en.wikipedia.org/wiki/Confidence_interval Confidence interval (Statistics)], [http://en.wikipedia.org/wiki/Student's_t-distribution Student's t-distribution] (implemented in [[GSL]]) * [http://en.wikipedia.org/wiki/Linear_regression Linear regression] (implemented in [[GSL]]) * [http://en.wikipedia.org/wiki/Binomial_tree#Binomial_tree Binomial tree] (use [[Graphviz]] to 
visualize trees) * [http://en.wikipedia.org/wiki/Spline_interpolation Spline interpolation], [http://en.wikipedia.org/wiki/B-spline Spline approximation] (implemented in [[GSL]]) == Tips & Tricks == * [http://www.gnu.org/s/libc/manual/html_node/Date-and-Time.html#Date-and-Time Timing in C] 101706ab3dac610ad49170d8c842f800d5ad8374 477 476 2010-09-01T10:55:17Z Root 1 /* HCL software for heterogeneous computing */ wikitext text/x-wiki This site is set up for sharing ideas, findings and experience in heterogeneous computing. Please, log in and create new or edit existing pages. How to format wiki-pages read [[Help:Editing|here]]. == HCL software for heterogeneous computing == * Extensions for [[MPI]]: [http://hcl.ucd.ie/project/mpC mpC] [http://hcl.ucd.ie/project/HeteroMPI HeteroMPI] [http://hcl.ucd.ie/project/libELC libELC] * Extensions for [[GridRPC]]: [http://hcl.ucd.ie/project/SmartGridSolve SmartGridSolve] [http://hcl.ucd.ie/project/NI-Connect NI-Connect] * Computation benchmarking, modeling, dynamic load balancing: [http://hcl.ucd.ie/project/fupermod FuPerMod] [http://hcl.ucd.ie/project/pmm PMM] * Communication benchmarking, modeling, optimization: [http://hcl.ucd.ie/project/cpm CPM] [http://hcl.ucd.ie/project/mpiblib MPIBlib] == Heterogeneous mathematical software == * [http://hcl.ucd.ie/project/HeteroScaLAPACK HeteroScaLAPACK] * [http://hcl.ucd.ie/project/Hydropad Hydropad] == Operating systems == * [[Linux]] * [[Windows]] == Development tools == * [[C/C++]], [[Python]] * [[Autotools]] * [[GDB]], [[OProfile]], [[Valgrind]] * [[Doxygen]] * [[ChangeLog]], [[Subversion]] * [[Eclipse]] == [[Libraries]] == * [[GNU C Library]] * [[MPI]] * [[STL]], [[Boost]] * [[GSL]] * [[BLAS/LAPACK]] == Data processing == * [[gnuplot]] * [[Graphviz]] * [[Octave]], [[R]] == Presentation == * [[Dia]] * [[LaTeX]] == Clusters == * [[HCL cluster]] * [[Other UCD Resources]] * [[UTK clusters]] * [[Grid5000]] [[SSH|How to connect to cluster via SSH]] == Mathematics == * 
[http://en.wikipedia.org/wiki/Confidence_interval Confidence interval (Statistics)], [http://en.wikipedia.org/wiki/Student's_t-distribution Student's t-distribution] (implemented in [[GSL]]) * [http://en.wikipedia.org/wiki/Linear_regression Linear regression] (implemented in [[GSL]]) * [http://en.wikipedia.org/wiki/Binomial_tree#Binomial_tree Binomial tree] (use [[Graphviz]] to visualize trees) * [http://en.wikipedia.org/wiki/Spline_interpolation Spline interpolation], [http://en.wikipedia.org/wiki/B-spline Spline approximation] (implemented in [[GSL]]) == Tips & Tricks == * [http://www.gnu.org/s/libc/manual/html_node/Date-and-Time.html#Date-and-Time Timing in C] c8a029a106d3f7cae61d847a0ff8092b579ee225 CPM 0 69 478 2010-09-02T11:25:28Z Kiril 3 New page: ToDo for CPM: * implement Traeff minimum/maximum decision making * implement mapping of processes based on sorting so that in a reserved set of resources, process X always refers to the e... wikitext text/x-wiki ToDo for CPM: * implement Traeff minimum/maximum decision making * implement mapping of processes based on sorting so that in a reserved set of resources, process X always refers to the exact same node Y a82f748ce61ed95a4d05f446e1846e9db6f64f74 HCL cluster 0 5 485 484 2010-09-16T18:23:32Z Davepc 2 /* Detailed Cluster Specification */ wikitext text/x-wiki == General Information == [[Image:Cluster.jpg|right|thumbnail||HCL Cluster]] [[Image:network.jpg|right|thumbnail||Layout of the Cluster]] The hcl cluster is heterogeneous in computing hardware & network ability. Nodes are from Dell, IBM, and HP, with Celeron, Pentium 4, Xeon, and AMD processors ranging in speeds from 1.8 to 3.6Ghz. Accordingly architectures and parameters such as Front Side Bus, Cache, and Main Memory all vary. Operating System used is Debian “squeeze” with Linux kernel 2.6.32. The network hardware consists of two Cisco 24+4 port Gigabit switches. 
Each node has two Gigabit Ethernet ports - each eth0 is connected to the first switch, and each eth1 is connected to the second switch. The switches are also connected to each other. The bandwidth of each port can be configured to any value between 8Kb/s and 1Gb/s. This allows testing on a very large number of network topologies. As the bandwidth on the link connecting the two switches can also be configured, the cluster can actually act as two separate clusters connected via one link. The diagram shows a schematic of the cluster. === Detailed Cluster Specification === * [[HCL Cluster Specifications]] * [[Old HCL Cluster Specifications]] (pre May 2010) === Documentation === * [[media:PE750.tgz|Dell Poweredge 750 Documentation]] * [[media:SC1425.tgz|Dell Poweredge SC1425 Documentation]] * [[media:X306.pdf|IBM x-Series 306 Documentation]] * [[media:E326.pdf|IBM e-Series 326 Documentation ]] * [[media:Proliant100SeriesGuide.pdf|HP Proliant DL-140 G2 Documentation]] * [[media:ProliantDL320G3Guide.pdf|HP Proliant DL-320 G3 Documentation]] * [[media:Cisco3560Specs.pdf|Cisco Catalyst 3560 Specifications]] * [[media:Cisco3560Guide.pdf|Cisco Catalyst 3560 User Guide]] * [[HCL Cluster Network]] == Cluster Administration == ===Useful Tools=== <code>root</code> on <code>heterogeneous.ucd.ie</code> has a number of [http://expect.nist.gov/ Expect] scripts to automate administration on the cluster (in <code>/root/scripts</code>). <code>root_ssh</code> will automatically log into a host, provide the root password and either return a shell to the user or execute a command that is passed as a second argument.
Command syntax is as follows: <source lang="text"> # root_ssh usage: root_ssh [user@]<host> [command] </source> Example usage, to log in and execute a command on each node in the cluster (note the file <code>/etc/dsh/machines.list</code> contains the hostnames of all compute nodes of the cluster): # for i in `cat /etc/dsh/machines.list`; do root_ssh $i ps ax \| grep pbs; done The above is sequential. To run parallel jobs, for example: <code>apt-get update && apt-get -y upgrade</code>, try the following trick with [http://www.gnu.org/software/screen/ screen]: # for i in `cat /etc/dsh/machines.list`; do screen -L -d -m root_ssh $i apt-get update \&\& apt-get -y upgrade; done You can check the screenlog.* files for errors and delete them when you are happy. Sometimes all logs are sent to screenlog.0; it is not clear why. == Software packages available on HCL Cluster 2.0 == With a fresh installation of the operating system on the HCL Cluster, the following packages are available: * autoconf * automake * fftw2 * git * gfortran * gnuplot * libtool * netperf * octave3.2 * qhull * subversion * valgrind * gsl-dev * vim * python * mc * openmpi-bin * openmpi-dev * evince * libboost-graph-dev * libboost-serialization-dev * r-cran-strucchange * graphviz * doxygen * colorgcc [[new hcl node install & configuration log]] [[new heterogeneous.ucd.ie install log]] ===APT=== To do unattended updates on cluster machines you need to specify some environment variables and switches to apt-get: export DEBIAN_FRONTEND=noninteractive apt-get -q -y upgrade NOTE: on hcl01 and hcl02 any updates to grub will force a prompt, despite the switches above. This happens because there are two disks on these machines and grub asks which it should install itself on. == Access and Security == All access and security for the cluster is handled by the gateway machine (heterogeneous.ucd.ie). This machine is not considered a compute node and should not be used as such.
The only new incoming connections allowed are ssh; other incoming packets, such as http, that are responding to requests from inside the cluster (established or related) are also allowed. Incoming ssh packets are only accepted if they originate from designated IP addresses. These must be registered UCD IP addresses. csserver.ucd.ie is allowed, as is hclgate.ucd.ie, on which all users have accounts. Thus to gain access to the cluster you can ssh from csserver, hclgate or other allowed machines to heterogeneous. From there you can ssh to any of the nodes (hcl01-hcl16) that you are running a pbs job on. Access from outside the UCD network is only allowed once you have gained entry to a server that allows outside connections (such as csserver.ucd.ie). === Access to the nodes is controlled by Torque PBS.=== Use qsub to submit a job, -I is for an interactive session, walltime is the time required. qsub -I -l walltime=1:00 \\ Reserve 1 node for 1 hour qsub -l nodes=hcl01+hcl07,walltime=1:00 myscript.sh Example Script: #!/bin/sh #General Script # # #These commands set up the Grid Environment for your job: #PBS -N JOBNAME #PBS -l walltime=48:00:00 #PBS -l nodes=16 #PBS -m abe #PBS -k eo #PBS -V echo foo To see the queue qstat -n showq To remove your job qdel JOBNUM More info: [http://www.clusterresources.com/products/torque/docs/] == Some networking issues on HCL cluster (unsolved) == "/sbin/route" should give: Kernel IP routing table Destination Gateway Genmask Flags Metric Ref Use Iface 239.2.11.72 * 255.255.255.255 UH 0 0 0 eth0 heterogeneous.u * 255.255.255.255 UH 0 0 0 eth0 192.168.21.0 * 255.255.255.0 U 0 0 0 eth1 192.168.20.0 * 255.255.255.0 U 0 0 0 eth0 192.168.20.0 * 255.255.254.0 U 0 0 0 eth0 192.168.20.0 * 255.255.254.0 U 0 0 0 eth1 default heterogeneous.u 0.0.0.0 UG 0 0 0 eth0 For reasons unclear, sometimes many machines miss the entry: 192.168.21.0 * 255.255.255.0 U 0 0 0 eth1 For Open MPI, this leads to inability to do a system sockets "connect" call to any
192.*.21.* address (hangup). For this case, you can * switch off eth1 (see also [http://hcl.ucd.ie/wiki/index.php/OpenMPI] ): mpirun --mca btl_tcp_if_exclude lo,eth1 ... or * you can restore the above table on all nodes by running "sh /etc/network/if-up.d/00routes" as root It is not yet clear why, without this entry, connections to the "21" addresses cannot be established. We expect that in this case the following rule should be matched (because of the mask): 192.168.20.0 * 255.255.254.0 U 0 0 0 eth0 The packets then leave over the eth0 network interface and should go over switch1 to switch2 and the eth1 interface of the corresponding node * If one attempts a ping from one node A, via its eth0 interface, to the address of another node's (B) eth1 interface, the following is observed: ** outgoing ping packets appear only on the eth0 interface of the first node A. ** incoming ping packets appear only on the eth1 interface of the second node B. ** outgoing ping response packets appear on the eth0 interface of the second node B, never on the eth1 interface despite pinging the eth1 address specifically. What explains this? With the routing tables as they are above, or in the damaged case, the ping may arrive at the correct interface, but the response from B is routed to A-eth0 via B-eth0. Further, after a number of ping packets have been sent in sequence (50 to 100), pings from A, though the -i eth0 switch is specified, begin to appear on both A-eth0 and A-eth1. This behaviour is unexpected, but does not affect the return path of the ping response packet. In order to get a symmetric behaviour, where a packet leaves A-eth0, travels via the switch bridge to B-eth1 and returns from B-eth1 to A-eth0, one must ensure the routing table of B contains no eth0 entries. == Paging and the OOM-Killer == Due to the nature of experiments run on the cluster, we often induce heavy paging and complete exhaustion of available memory on certain nodes.
Linux has a pair of strategies to deal with heavy memory use. The first is overcommitting. This is where a process is allowed to allocate or fork even when there is no more memory available. You can see some interesting numbers here:[http://opsmonkey.blogspot.com/2007/01/linux-memory-overcommit.html]. The assumption is that processes may not use all the memory that they allocate, and failing on allocation is worse than failing at a later date when the memory is actually required. More processes may be supported by allowing them to allocate memory (provided they do not use it all). The second part of the strategy is the Out-of-Memory killer (OOM Killer). When memory has been exhausted and a process tries to use some 'overcommitted' part of memory, the OOM killer is invoked. Its job is to rank all processes in terms of their memory use, priority, privilege and some other parameters, and then select a process to kill based on the ranks. The argument for using overcommit + OOM killer is that rather than failing to allocate memory for some random unlucky process, which as a result would probably terminate, the kernel can instead allow the unlucky process to continue executing and then make a somewhat informed decision on which process to kill. Unfortunately, the behaviour of the OOM-killer sometimes causes problems that grind the machine to a complete halt, particularly when it decides to kill system processes. There is a good discussion on the OOM-killer here: [http://lwn.net/Articles/104179/] For this reason overcommit has been disabled on the cluster.
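With overcommit_memory set to 2, the kernel refuses allocations that would push total committed memory past a commit limit of swap + RAM &times; overcommit_ratio / 100, so with a ratio of 100 the limit is roughly swap plus all of RAM. A minimal sketch of that arithmetic (the function name is ours; huge pages and kernel reservations are ignored):

```python
def commit_limit_bytes(ram_bytes, swap_bytes, overcommit_ratio):
    """Approximate CommitLimit used in overcommit mode 2:
    swap + ram * overcommit_ratio / 100.

    This sketch ignores huge pages and kernel reservations."""
    return swap_bytes + ram_bytes * overcommit_ratio // 100

gib = 1024 ** 3
# With ratio 100, as set on the cluster, the limit is simply RAM + swap:
print(commit_limit_bytes(4 * gib, 2 * gib, 100) // gib)
```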
cat /proc/sys/vm/overcommit_memory 2 cat /proc/sys/vm/overcommit_ratio 100 To restore to default overcommit # echo 0 > /proc/sys/vm/overcommit_memory # echo 50 > /proc/sys/vm/overcommit_ratio ea623f79fa94bb4855c0750bf6c1cc22592c8102 490 485 2010-09-23T17:05:53Z Davepc 2 wikitext text/x-wiki == General Information == [[Image:Cluster.jpg|right|thumbnail||HCL Cluster]] [[Image:network.jpg|right|thumbnail||Layout of the Cluster]] The hcl cluster is heterogeneous in computing hardware & network ability. Nodes are from Dell, IBM, and HP, with Celeron, Pentium 4, Xeon, and AMD processors ranging in speeds from 1.8 to 3.6Ghz. Accordingly architectures and parameters such as Front Side Bus, Cache, and Main Memory all vary. Operating System used is Debian “squeeze” with Linux kernel 2.6.32. The network hardware consists of two Cisco 24+4 port Gigabit switches. Each node has two Gigabit ethernet ports - each eth0 is connected to the first switch, and each eth1 is connected to the second switch. The switches are also connected to each other. The bandwidth of each port can be configured to meet any value between 8Kb/s and 1Gb/s. This allows testing on a very large number of network topologies, As the bandwidth on the link connecting the two switches can also be configured, the cluster can actually act as two separate clusters connected via one link. The diagram shows a schematic of the cluster. 
=== Detailed Cluster Specification ===
* [[HCL Cluster Specifications]]
* [[Old HCL Cluster Specifications]] (pre May 2010)

=== Documentation ===
* [[media:PE750.tgz|Dell Poweredge 750 Documentation]]
* [[media:SC1425.tgz|Dell Poweredge SC1425 Documentation]]
* [[media:X306.pdf|IBM x-Series 306 Documentation]]
* [[media:E326.pdf|IBM e-Series 326 Documentation]]
* [[media:Proliant100SeriesGuide.pdf|HP Proliant DL-140 G2 Documentation]]
* [[media:ProliantDL320G3Guide.pdf|HP Proliant DL-320 G3 Documentation]]
* [[media:Cisco3560Specs.pdf|Cisco Catalyst 3560 Specifications]]
* [[media:Cisco3560Guide.pdf|Cisco Catalyst 3560 User Guide]]
* [[HCL Cluster Network]]

== Cluster Administration ==
=== Useful Tools ===
<code>root</code> on <code>heterogeneous.ucd.ie</code> has a number of [http://expect.nist.gov/ Expect] scripts (in <code>/root/scripts</code>) that automate administration of the cluster. <code>root_ssh</code> logs into a host, supplies the root password, and either returns a shell to the user or executes a command passed as a second argument. The syntax is as follows:
<source lang="text">
# root_ssh
usage: root_ssh [user@]<host> [command]
</source>
Example usage, to log in and execute a command on each node of the cluster (the file <code>/etc/dsh/machines.list</code> contains the hostnames of all compute nodes):
 # for i in `cat /etc/dsh/machines.list`; do root_ssh $i ps ax \| grep pbs; done
The above runs sequentially. To run jobs in parallel, for example <code>apt-get update && apt-get -y upgrade</code>, try the following trick with [http://www.gnu.org/software/screen/ screen]:
 # for i in `cat /etc/dsh/machines.list`; do screen -L -d -m root_ssh $i 'apt-get update && apt-get -y upgrade'; done
Check the <code>screenlog.*</code> files for errors and delete them when you are happy. Sometimes all logs are sent to <code>screenlog.0</code>; it is not clear why.
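The interleaved <code>screenlog.0</code> problem can be avoided by giving each host its own log file. Below is a runnable sketch of the same parallel pattern using background jobs and per-host redirection; <code>root_ssh</code> is stubbed with a shell function here so the loop logic can be demonstrated without cluster access, and the machines file is a stand-in for <code>/etc/dsh/machines.list</code>.

```shell
# root_ssh is stubbed so this sketch runs anywhere; on heterogeneous.ucd.ie,
# drop the stub to use the real Expect script in /root/scripts.
root_ssh() { echo "[$1] $2"; }

machines=$(mktemp)                     # stand-in for /etc/dsh/machines.list
printf 'hcl01\nhcl02\nhcl03\n' > "$machines"

for host in $(cat "$machines"); do
    # one background job and one log file per host, so output never interleaves
    root_ssh "$host" 'apt-get update && apt-get -y upgrade' > "log.$host" 2>&1 &
done
wait                                   # block until every node has finished
cat log.hcl01                          # each node's output is in its own file
rm -f "$machines" log.hcl01 log.hcl02 log.hcl03
```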
== Software packages available on HCL Cluster 2.0 ==
With a fresh installation of the operating system on the HCL Cluster, the following packages are available:
* autoconf
* automake
* fftw2
* git
* gfortran
* gnuplot
* libtool
* netperf
* octave3.2
* qhull
* subversion
* valgrind
* gsl-dev
* vim
* python
* mc
* openmpi-bin
* openmpi-dev
* evince
* libboost-graph-dev
* libboost-serialization-dev
* r-cran-strucchange
* graphviz
* doxygen
* colorgcc

[[new hcl node install & configuration log]]

[[new heterogeneous.ucd.ie install log]]

=== APT ===
To perform unattended updates on cluster machines, you need to set an environment variable and pass some switches to apt-get:
 export DEBIAN_FRONTEND=noninteractive
 apt-get -q -y upgrade
NOTE: on hcl01 and hcl02, any update to grub will force a prompt despite the switches above. This happens because these machines have two disks and grub asks which one it should install itself on.

== Access and Security ==
All access and security for the cluster is handled by the gateway machine (heterogeneous.ucd.ie). This machine is not considered a compute node and should not be used as one. The only new incoming connections allowed are ssh; other incoming packets, such as http responses to requests made from inside the cluster (established or related), are also allowed. Incoming ssh packets are accepted only if they originate from designated IP addresses, which must be registered UCD IPs. csserver.ucd.ie is allowed, as is hclgate.ucd.ie, on which all users have accounts. Thus, to gain access to the cluster, ssh from csserver, hclgate or another allowed machine to heterogeneous. From there you can ssh to any of the nodes (hcl01-hcl16) on which you are running a PBS job.
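The hop sequence above can be automated on the client side. Below is a hypothetical OpenSSH configuration fragment: the <code>Host</code> patterns and <code>ProxyCommand ... -W</code> idiom are standard OpenSSH (5.4 or later), but the usernames, and whether your entry point is csserver or hclgate, need checking against your own accounts.

```text
# ~/.ssh/config (on your own machine) -- hypothetical sketch
Host heterogeneous
    HostName heterogeneous.ucd.ie
    # reach the gateway through an allowed host first:
    ProxyCommand ssh csserver.ucd.ie -W %h:%p

Host hcl*
    # hop to a compute node via the gateway
    # (only useful while you hold a PBS job on that node):
    ProxyCommand ssh heterogeneous -W %h:%p
```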
Access from outside the UCD network is possible only once you have gained entry to a server that allows outside connections (such as csserver.ucd.ie).

=== Access to the nodes is controlled by Torque PBS ===
Use qsub to submit a job; -I requests an interactive session, and walltime is the time required:
 qsub -I -l walltime=1:00:00                            # reserve 1 node for 1 hour
 qsub -l nodes=hcl01+hcl07,walltime=1:00 myscript.sh
Example script:
<source lang="bash">
#!/bin/sh
# General script
# These commands set up the Grid Environment for your job:
#PBS -N JOBNAME
#PBS -l walltime=48:00:00
#PBS -l nodes=16
#PBS -m abe
#PBS -k eo
#PBS -V
echo foo
</source>
To see the queue:
 qstat -n
 showq
To remove your job:
 qdel JOBNUM
More info: [http://www.clusterresources.com/products/torque/docs/]

== Some networking issues on HCL cluster (unsolved) ==
<code>/sbin/route</code> should give:
<source lang="text">
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
239.2.11.72     *               255.255.255.255 UH    0      0        0 eth0
heterogeneous.u *               255.255.255.255 UH    0      0        0 eth0
192.168.21.0    *               255.255.255.0   U     0      0        0 eth1
192.168.20.0    *               255.255.255.0   U     0      0        0 eth0
192.168.20.0    *               255.255.254.0   U     0      0        0 eth0
192.168.20.0    *               255.255.254.0   U     0      0        0 eth1
default         heterogeneous.u 0.0.0.0         UG    0      0        0 eth0
</source>
For reasons unclear, many machines sometimes miss the entry:
 192.168.21.0    *               255.255.255.0   U     0      0        0 eth1
For Open MPI, this leads to an inability to make a sockets "connect" call to any 192.*.21.* address (the connection hangs). In this case, you can either:
* switch off eth1 (see also [http://hcl.ucd.ie/wiki/index.php/OpenMPI]):
 mpirun --mca btl_tcp_if_exclude lo,eth1 ...
* or restore the table above on all nodes by running <code>sh /etc/network/if-up.d/00routes</code> as root.
It is not yet clear why, without this entry, connections to the "21" addresses fail.
We expect that in this case the following rule should be matched (because of the mask):
 192.168.20.0    *               255.255.254.0   U     0      0        0 eth0
The packets then leave over the eth0 interface and should travel via switch 1 to switch 2 and on to the eth1 interface of the corresponding node.
* If one pings from node A, via its eth0 interface, the address of another node B's eth1 interface, the following is observed:
** outgoing ping packets appear only on the eth0 interface of node A;
** incoming ping packets appear only on the eth1 interface of node B;
** outgoing ping response packets appear on the eth0 interface of node B, never on the eth1 interface, despite the eth1 address being pinged specifically.
What explains this? With the routing tables as above, or in the damaged case, the ping may arrive at the correct interface, but the response from B is routed to A-eth0 via B-eth0. Further, after a number of ping packets have been sent in sequence (50 to 100), pings from A, even though the -i eth0 switch is specified, begin to appear on both A-eth0 and A-eth1. This behaviour is unexpected, but it does not affect the return path of the ping response packet. To get symmetric behaviour, where a packet leaves A-eth0, travels via the switch bridge to B-eth1 and returns from B-eth1 to A-eth0, one must ensure that the routing table of B contains no eth0 entries.

== Paging and the OOM-Killer ==
Due to the nature of experiments run on the cluster, we often induce heavy paging and complete exhaustion of available memory on certain nodes. Linux has a pair of strategies for dealing with heavy memory use. The first is overcommitting: a process is allowed to allocate memory or fork even when no more memory is available. You can see some interesting numbers here: [http://opsmonkey.blogspot.com/2007/01/linux-memory-overcommit.html].
The assumption is that processes may not use all memory that they allocate and failing on allocation is worse than failing at a later date when the memory use is actually required. More processes may be supported by allowing them to allocate memory (provided they do not use it all). The second part of the strategy is the Out-of-Memory killer (OOM Killer). When memory has been exhausted and a process tries to use some 'overcommitted' part of memory, the OOM killer is invoked. It's job is to rank all processes in terms of their memory use, priority, privilege and some other parameters, and then select a process to kill based on the ranks. The argument for using overcommital+OOM Killer is that rather than failing to allocate memory for some random unlucky process, which as a result would probably terminate, the kernel can instead allow the unlucky process to continue executing and then make a some-what-informed decision on which process to kill. Unfortunately, the behaviour of the OOM-killer sometimes causes problems which grind the machine to a complete halt, particularly when it decides to kill system processes. There is a good discussion on the OOM-killer here: [http://lwn.net/Articles/104179/] For this reason overcommit has been disabled on the cluster. cat /proc/sys/vm/overcommit_memory 2 cat /proc/sys/vm/overcommit_ratio 100 To restore to default overcommit # echo 0 > /proc/sys/vm/overcommit_memory # echo 50 > /proc/sys/vm/overcommit_ratio == Manually Limit the Memory on the OS level == as root edit /etc/default/grub GRUB_CMDLINE_LINUX_DEFAULT="quiet mem=128M" then run the command update-grub reboot e23d4708fbb5e21c071c457b286b1382c1afdb5f 510 490 2010-10-12T19:28:50Z Davepc 2 /* Software packages available on HCL Cluster 2.0 */ wikitext text/x-wiki == General Information == [[Image:Cluster.jpg|right|thumbnail||HCL Cluster]] [[Image:network.jpg|right|thumbnail||Layout of the Cluster]] The hcl cluster is heterogeneous in computing hardware & network ability. 
Nodes are from Dell, IBM, and HP, with Celeron, Pentium 4, Xeon, and AMD processors ranging in speeds from 1.8 to 3.6Ghz. Accordingly architectures and parameters such as Front Side Bus, Cache, and Main Memory all vary. Operating System used is Debian “squeeze” with Linux kernel 2.6.32. The network hardware consists of two Cisco 24+4 port Gigabit switches. Each node has two Gigabit ethernet ports - each eth0 is connected to the first switch, and each eth1 is connected to the second switch. The switches are also connected to each other. The bandwidth of each port can be configured to meet any value between 8Kb/s and 1Gb/s. This allows testing on a very large number of network topologies, As the bandwidth on the link connecting the two switches can also be configured, the cluster can actually act as two separate clusters connected via one link. The diagram shows a schematic of the cluster. === Detailed Cluster Specification === * [[HCL Cluster Specifications]] * [[Old HCL Cluster Specifications]] (pre May 2010) === Documentation === * [[media:PE750.tgz|Dell Poweredge 750 Documentation]] * [[media:SC1425.tgz|Dell Poweredge SC1425 Documentation]] * [[media:X306.pdf|IBM x-Series 306 Documentation]] * [[media:E326.pdf|IBM e-Series 326 Documentation ]] * [[media:Proliant100SeriesGuide.pdf|HP Proliant DL-140 G2 Documentation]] * [[media:ProliantDL320G3Guide.pdf|HP Proliant DL-320 G3 Documentation]] * [[media:Cisco3560Specs.pdf|Cisco Catalyst 3560 Specifications]] * [[media:Cisco3560Guide.pdf|Cisco Catalyst 3560 User Guide]] * [[HCL Cluster Network]] == Cluster Administration == ===Useful Tools=== <code>root</code> on <code>heterogeneous.ucd.ie</code> has a number of [http://expect.nist.gov/ Expect] scripts to automate administration on the cluster (in <code>/root/scripts</code>). <code>root_ssh</code> will automatically log into a host, provide the root password and either return a shell to the user or execute a command that is passed as a second argument. 
Command syntax is as follows: <source lang="text"> # root_ssh usage: root_ssh [user@]<host> [command] </source> Example usage, to login and execute a command on each node in the cluster (note the file <code>/etc/dsh/machines.list</code> contains the hostnames of all compute nodes of the cluster): # for i in `cat /etc/dsh/machines.list`; do root_ssh $i ps ax \| grep pbs; done The above is sequential. To run parallel jobs, for example: <code>apt-get update && apt-get -y upgrade</code>, try the following trick with [http://www.gnu.org/software/screen/ screen]: # for i in `cat /etc/dsh/machines.list`; do screen -L -d -m root_ssh $i apt-get update \&\& apt-get -y upgrade'; done You can check the screenlog.* files for errors and delete them when you are happy. Sometimes all logs are sent to screenlog.0, not sure why. == Software packages available on HCL Cluster 2.0 == Wit a fresh installation of operating systems on HCL Cluster the follow list of packages are avalible: * autoconf * automake * gcc * ctags * cg-vg * fftw2 * git * gfortran * gnuplot * libtool * netperf * octave3.2 * qhull * subversion * valgrind * gsl-dev * vim * python * mc * openmpi-bin * openmpi-dev * evince * libboost-graph-dev * libboost-serialization-dev * r-cran-strucchange * graphviz * doxygen * colorgcc [[new hcl node install & configuration log]] [[new heterogeneous.ucd.ie install log]] ===APT=== To do unattended updates on cluster machines you need to specify some environment variables and switches to apt-get: export DEBIAN_FRONTEND=noninteractive apt-get -q -y upgrade NOTE: on hcl01 and hcl02 any updates to grub will force a prompt, despite the switches above. This happens because there are two disks on these machines and grub asks which it should install itself on. == Access and Security == All access and security for the cluster is handled by the gateway machine (heterogeneous.ucd.ie). This machine is not considered a compute node and should not be used as such. 
The only new incoming connections allowed are ssh, other incoming packets such as http that are responding to requests from inside the cluster (established or related) are also allowed. Incoming ssh packets are only accepted if they are originating from designated IP addresses. These IP's must be registered ucd IP's. csserver.ucd.ie is allowed, as is hclgate.ucd.ie, on which all users have accounts. Thus to gain access to the cluster you can ssh from csserver, hclgate or other allowed machines to heterogeneous. From there you can ssh to any of the nodes (hcl01-hcl16) that you are running a pbs job on. Access from outside the UCD network is only allowed once you have gained entry to a server that allows outside connections (such as csserver.ucd.ie) === Access to the nodes is controlled by Torque PBS.=== Use qsub to submit a job, -I is for an interactive session, walltime is time required. qsub -I -l walltime=1:00 \\ Reserve 1 node for 1 hour qsub -l nodes=hcl01+hcl07,walltime=1:00 myscript.sh Example Script: #!/bin/sh #General Script # # #These commands set up the Grid Environment for your job: #PBS -N JOBNAME #PBS -l walltime=48:00:00 #PBS -l nodes=16 #PBS -m abe #PBS -k eo #PBS -V echo foo So see the queue qstat -n showq To remove your job qdel JOBNUM More info: [http://www.clusterresources.com/products/torque/docs/] == Some networking issues on HCL cluster (unsolved) == "/sbin/route" should give: Kernel IP routing table Destination Gateway Genmask Flags Metric Ref Use Iface 239.2.11.72 * 255.255.255.255 UH 0 0 0 eth0 heterogeneous.u * 255.255.255.255 UH 0 0 0 eth0 192.168.21.0 * 255.255.255.0 U 0 0 0 eth1 192.168.20.0 * 255.255.255.0 U 0 0 0 eth0 192.168.20.0 * 255.255.254.0 U 0 0 0 eth0 192.168.20.0 * 255.255.254.0 U 0 0 0 eth1 default heterogeneous.u 0.0.0.0 UG 0 0 0 eth0 For reasons unclear, sometimes many machines miss the entry: 192.168.21.0 * 255.255.255.0 U 0 0 0 eth1 For Open MPI, this leads to inability to do a system sockets "connect" call to any 
192.*.21.* address (hangup). For this case, you can * switch off eth1 (see also [http://hcl.ucd.ie/wiki/index.php/OpenMPI] ): mpirun --mca btl_tcp_if_exclude lo,eth1 ... or * you can restore the above table on all nodes by running "sh /etc/network/if-up.d/00routes" as root It is not yet clear why without this entry the connection to the "21" addresses can't be connected. We expect that in this case following rule should be matched (because of the mask): 192.168.20.0 * 255.255.254.0 U 0 0 0 eth0 The packets leave over the eth0 network interface then and should go over switch1 to switch2 and eth1 interface of the corresponding node * If one attempts a ping from one node A, via its eth0 interface, to the address of another node's (B) eth1 interface, the following is observed: ** outgoing ping packets appear only on the eth0 interface of the first node A. ** incoming ping packets appear only on eth1 interface of the second node B. ** outgoing ping response packets appear on the eth0 interface of the second node B, never on the eth1 interface despite pinging the eth1 address specifically. What explains this? With the routing tables as they are above, or in the damaged case, the ping may arrive to the correct interface, but the response from B is routed to A-eth0 via B-eth0. Further, after a number of ping packets have been sent in sequence (50 to 100), pings from A, though the -i eth0 switch is specified, begin to appear on both A-eth0 and A-eth1. This behaviour is unexpected, but does not effect the return path of the ping response packet. In order to get a symmetric behaviour, where a packet leaves A-eth0, travels via the switch bridge to B-eth1 and returns back from B-eth1 to A-eth0, one must ensure the routing table of B contains no eth0 entries. == Paging and the OOM-Killer == Due to the nature of experiments run on the cluster, we often induce heavy paging and complete exhaustion of available memory on certain nodes. 
Linux has a pair of strategies to deal with heavy memory use. First, is overcommitting. This is where a process is allowed allocate or fork even when there is no more memory available. You can seem some interesting numbers here:[http://opsmonkey.blogspot.com/2007/01/linux-memory-overcommit.html]. The assumption is that processes may not use all memory that they allocate and failing on allocation is worse than failing at a later date when the memory use is actually required. More processes may be supported by allowing them to allocate memory (provided they do not use it all). The second part of the strategy is the Out-of-Memory killer (OOM Killer). When memory has been exhausted and a process tries to use some 'overcommitted' part of memory, the OOM killer is invoked. It's job is to rank all processes in terms of their memory use, priority, privilege and some other parameters, and then select a process to kill based on the ranks. The argument for using overcommital+OOM Killer is that rather than failing to allocate memory for some random unlucky process, which as a result would probably terminate, the kernel can instead allow the unlucky process to continue executing and then make a some-what-informed decision on which process to kill. Unfortunately, the behaviour of the OOM-killer sometimes causes problems which grind the machine to a complete halt, particularly when it decides to kill system processes. There is a good discussion on the OOM-killer here: [http://lwn.net/Articles/104179/] For this reason overcommit has been disabled on the cluster. 
cat /proc/sys/vm/overcommit_memory 2 cat /proc/sys/vm/overcommit_ratio 100 To restore to default overcommit # echo 0 > /proc/sys/vm/overcommit_memory # echo 50 > /proc/sys/vm/overcommit_ratio == Manually Limit the Memory on the OS level == as root edit /etc/default/grub GRUB_CMDLINE_LINUX_DEFAULT="quiet mem=128M" then run the command update-grub reboot 1c83368f5294f861eded6e82978a32c3a54f38f5 518 510 2010-10-15T14:05:12Z Davepc 2 /* Software packages available on HCL Cluster 2.0 */ wikitext text/x-wiki == General Information == [[Image:Cluster.jpg|right|thumbnail||HCL Cluster]] [[Image:network.jpg|right|thumbnail||Layout of the Cluster]] The hcl cluster is heterogeneous in computing hardware & network ability. Nodes are from Dell, IBM, and HP, with Celeron, Pentium 4, Xeon, and AMD processors ranging in speeds from 1.8 to 3.6Ghz. Accordingly architectures and parameters such as Front Side Bus, Cache, and Main Memory all vary. Operating System used is Debian “squeeze” with Linux kernel 2.6.32. The network hardware consists of two Cisco 24+4 port Gigabit switches. Each node has two Gigabit ethernet ports - each eth0 is connected to the first switch, and each eth1 is connected to the second switch. The switches are also connected to each other. The bandwidth of each port can be configured to meet any value between 8Kb/s and 1Gb/s. This allows testing on a very large number of network topologies, As the bandwidth on the link connecting the two switches can also be configured, the cluster can actually act as two separate clusters connected via one link. The diagram shows a schematic of the cluster. 
=== Detailed Cluster Specification === * [[HCL Cluster Specifications]] * [[Old HCL Cluster Specifications]] (pre May 2010) === Documentation === * [[media:PE750.tgz|Dell Poweredge 750 Documentation]] * [[media:SC1425.tgz|Dell Poweredge SC1425 Documentation]] * [[media:X306.pdf|IBM x-Series 306 Documentation]] * [[media:E326.pdf|IBM e-Series 326 Documentation ]] * [[media:Proliant100SeriesGuide.pdf|HP Proliant DL-140 G2 Documentation]] * [[media:ProliantDL320G3Guide.pdf|HP Proliant DL-320 G3 Documentation]] * [[media:Cisco3560Specs.pdf|Cisco Catalyst 3560 Specifications]] * [[media:Cisco3560Guide.pdf|Cisco Catalyst 3560 User Guide]] * [[HCL Cluster Network]] == Cluster Administration == ===Useful Tools=== <code>root</code> on <code>heterogeneous.ucd.ie</code> has a number of [http://expect.nist.gov/ Expect] scripts to automate administration on the cluster (in <code>/root/scripts</code>). <code>root_ssh</code> will automatically log into a host, provide the root password and either return a shell to the user or execute a command that is passed as a second argument. Command syntax is as follows: <source lang="text"> # root_ssh usage: root_ssh [user@]<host> [command] </source> Example usage, to login and execute a command on each node in the cluster (note the file <code>/etc/dsh/machines.list</code> contains the hostnames of all compute nodes of the cluster): # for i in `cat /etc/dsh/machines.list`; do root_ssh $i ps ax \| grep pbs; done The above is sequential. To run parallel jobs, for example: <code>apt-get update && apt-get -y upgrade</code>, try the following trick with [http://www.gnu.org/software/screen/ screen]: # for i in `cat /etc/dsh/machines.list`; do screen -L -d -m root_ssh $i apt-get update \&\& apt-get -y upgrade'; done You can check the screenlog.* files for errors and delete them when you are happy. Sometimes all logs are sent to screenlog.0, not sure why. 
== Software packages available on HCL Cluster 2.0 == Wit a fresh installation of operating systems on HCL Cluster the follow list of packages are avalible: * autoconf * automake * gcc * ctags * cg-vg * fftw2 * git * gfortran * gnuplot * libtool * netperf * octave3.2 * qhull * subversion * valgrind * gsl-dev * vim * python * mc * openmpi-bin * openmpi-dev * evince * libboost-graph-dev * libboost-serialization-dev * r-cran-strucchange * graphviz * doxygen * colorgcc [[/new hcl node install & configuration log]] [[/new heterogeneous.ucd.ie install log]] ===APT=== To do unattended updates on cluster machines you need to specify some environment variables and switches to apt-get: export DEBIAN_FRONTEND=noninteractive apt-get -q -y upgrade NOTE: on hcl01 and hcl02 any updates to grub will force a prompt, despite the switches above. This happens because there are two disks on these machines and grub asks which it should install itself on. == Access and Security == All access and security for the cluster is handled by the gateway machine (heterogeneous.ucd.ie). This machine is not considered a compute node and should not be used as such. The only new incoming connections allowed are ssh, other incoming packets such as http that are responding to requests from inside the cluster (established or related) are also allowed. Incoming ssh packets are only accepted if they are originating from designated IP addresses. These IP's must be registered ucd IP's. csserver.ucd.ie is allowed, as is hclgate.ucd.ie, on which all users have accounts. Thus to gain access to the cluster you can ssh from csserver, hclgate or other allowed machines to heterogeneous. From there you can ssh to any of the nodes (hcl01-hcl16) that you are running a pbs job on. 
Access from outside the UCD network is only allowed once you have gained entry to a server that allows outside connections (such as csserver.ucd.ie) === Access to the nodes is controlled by Torque PBS.=== Use qsub to submit a job, -I is for an interactive session, walltime is time required. qsub -I -l walltime=1:00 \\ Reserve 1 node for 1 hour qsub -l nodes=hcl01+hcl07,walltime=1:00 myscript.sh Example Script: #!/bin/sh #General Script # # #These commands set up the Grid Environment for your job: #PBS -N JOBNAME #PBS -l walltime=48:00:00 #PBS -l nodes=16 #PBS -m abe #PBS -k eo #PBS -V echo foo So see the queue qstat -n showq To remove your job qdel JOBNUM More info: [http://www.clusterresources.com/products/torque/docs/] == Some networking issues on HCL cluster (unsolved) == "/sbin/route" should give: Kernel IP routing table Destination Gateway Genmask Flags Metric Ref Use Iface 239.2.11.72 * 255.255.255.255 UH 0 0 0 eth0 heterogeneous.u * 255.255.255.255 UH 0 0 0 eth0 192.168.21.0 * 255.255.255.0 U 0 0 0 eth1 192.168.20.0 * 255.255.255.0 U 0 0 0 eth0 192.168.20.0 * 255.255.254.0 U 0 0 0 eth0 192.168.20.0 * 255.255.254.0 U 0 0 0 eth1 default heterogeneous.u 0.0.0.0 UG 0 0 0 eth0 For reasons unclear, sometimes many machines miss the entry: 192.168.21.0 * 255.255.255.0 U 0 0 0 eth1 For Open MPI, this leads to inability to do a system sockets "connect" call to any 192.*.21.* address (hangup). For this case, you can * switch off eth1 (see also [http://hcl.ucd.ie/wiki/index.php/OpenMPI] ): mpirun --mca btl_tcp_if_exclude lo,eth1 ... or * you can restore the above table on all nodes by running "sh /etc/network/if-up.d/00routes" as root It is not yet clear why without this entry the connection to the "21" addresses can't be connected. 
We expect that in this case following rule should be matched (because of the mask): 192.168.20.0 * 255.255.254.0 U 0 0 0 eth0 The packets leave over the eth0 network interface then and should go over switch1 to switch2 and eth1 interface of the corresponding node * If one attempts a ping from one node A, via its eth0 interface, to the address of another node's (B) eth1 interface, the following is observed: ** outgoing ping packets appear only on the eth0 interface of the first node A. ** incoming ping packets appear only on eth1 interface of the second node B. ** outgoing ping response packets appear on the eth0 interface of the second node B, never on the eth1 interface despite pinging the eth1 address specifically. What explains this? With the routing tables as they are above, or in the damaged case, the ping may arrive to the correct interface, but the response from B is routed to A-eth0 via B-eth0. Further, after a number of ping packets have been sent in sequence (50 to 100), pings from A, though the -i eth0 switch is specified, begin to appear on both A-eth0 and A-eth1. This behaviour is unexpected, but does not effect the return path of the ping response packet. In order to get a symmetric behaviour, where a packet leaves A-eth0, travels via the switch bridge to B-eth1 and returns back from B-eth1 to A-eth0, one must ensure the routing table of B contains no eth0 entries. == Paging and the OOM-Killer == Due to the nature of experiments run on the cluster, we often induce heavy paging and complete exhaustion of available memory on certain nodes. Linux has a pair of strategies to deal with heavy memory use. First, is overcommitting. This is where a process is allowed allocate or fork even when there is no more memory available. You can seem some interesting numbers here:[http://opsmonkey.blogspot.com/2007/01/linux-memory-overcommit.html]. 
The assumption is that processes may not use all memory that they allocate and failing on allocation is worse than failing at a later date when the memory use is actually required. More processes may be supported by allowing them to allocate memory (provided they do not use it all). The second part of the strategy is the Out-of-Memory killer (OOM Killer). When memory has been exhausted and a process tries to use some 'overcommitted' part of memory, the OOM killer is invoked. It's job is to rank all processes in terms of their memory use, priority, privilege and some other parameters, and then select a process to kill based on the ranks. The argument for using overcommital+OOM Killer is that rather than failing to allocate memory for some random unlucky process, which as a result would probably terminate, the kernel can instead allow the unlucky process to continue executing and then make a some-what-informed decision on which process to kill. Unfortunately, the behaviour of the OOM-killer sometimes causes problems which grind the machine to a complete halt, particularly when it decides to kill system processes. There is a good discussion on the OOM-killer here: [http://lwn.net/Articles/104179/] For this reason overcommit has been disabled on the cluster. cat /proc/sys/vm/overcommit_memory 2 cat /proc/sys/vm/overcommit_ratio 100 To restore to default overcommit # echo 0 > /proc/sys/vm/overcommit_memory # echo 50 > /proc/sys/vm/overcommit_ratio == Manually Limit the Memory on the OS level == as root edit /etc/default/grub GRUB_CMDLINE_LINUX_DEFAULT="quiet mem=128M" then run the command update-grub reboot edabf79cb57f4f5636e1a3e83ce5123ca899b914 521 518 2010-10-15T14:10:55Z Davepc 2 /* Software packages available on HCL Cluster 2.0 */ wikitext text/x-wiki == General Information == [[Image:Cluster.jpg|right|thumbnail||HCL Cluster]] [[Image:network.jpg|right|thumbnail||Layout of the Cluster]] The hcl cluster is heterogeneous in computing hardware & network ability. 
Nodes are from Dell, IBM, and HP, with Celeron, Pentium 4, Xeon, and AMD processors ranging in speeds from 1.8 to 3.6Ghz. Accordingly architectures and parameters such as Front Side Bus, Cache, and Main Memory all vary. Operating System used is Debian “squeeze” with Linux kernel 2.6.32. The network hardware consists of two Cisco 24+4 port Gigabit switches. Each node has two Gigabit ethernet ports - each eth0 is connected to the first switch, and each eth1 is connected to the second switch. The switches are also connected to each other. The bandwidth of each port can be configured to meet any value between 8Kb/s and 1Gb/s. This allows testing on a very large number of network topologies, As the bandwidth on the link connecting the two switches can also be configured, the cluster can actually act as two separate clusters connected via one link. The diagram shows a schematic of the cluster. === Detailed Cluster Specification === * [[HCL Cluster Specifications]] * [[Old HCL Cluster Specifications]] (pre May 2010) === Documentation === * [[media:PE750.tgz|Dell Poweredge 750 Documentation]] * [[media:SC1425.tgz|Dell Poweredge SC1425 Documentation]] * [[media:X306.pdf|IBM x-Series 306 Documentation]] * [[media:E326.pdf|IBM e-Series 326 Documentation ]] * [[media:Proliant100SeriesGuide.pdf|HP Proliant DL-140 G2 Documentation]] * [[media:ProliantDL320G3Guide.pdf|HP Proliant DL-320 G3 Documentation]] * [[media:Cisco3560Specs.pdf|Cisco Catalyst 3560 Specifications]] * [[media:Cisco3560Guide.pdf|Cisco Catalyst 3560 User Guide]] * [[HCL Cluster Network]] == Cluster Administration == ===Useful Tools=== <code>root</code> on <code>heterogeneous.ucd.ie</code> has a number of [http://expect.nist.gov/ Expect] scripts to automate administration on the cluster (in <code>/root/scripts</code>). <code>root_ssh</code> will automatically log into a host, provide the root password and either return a shell to the user or execute a command that is passed as a second argument. 
Command syntax is as follows: <source lang="text"> # root_ssh usage: root_ssh [user@]<host> [command] </source> Example usage, to login and execute a command on each node in the cluster (note the file <code>/etc/dsh/machines.list</code> contains the hostnames of all compute nodes of the cluster): # for i in `cat /etc/dsh/machines.list`; do root_ssh $i ps ax \| grep pbs; done The above is sequential. To run parallel jobs, for example: <code>apt-get update && apt-get -y upgrade</code>, try the following trick with [http://www.gnu.org/software/screen/ screen]: # for i in `cat /etc/dsh/machines.list`; do screen -L -d -m root_ssh $i apt-get update \&\& apt-get -y upgrade'; done You can check the screenlog.* files for errors and delete them when you are happy. Sometimes all logs are sent to screenlog.0, not sure why. == Software packages available on HCL Cluster 2.0 == Wit a fresh installation of operating systems on HCL Cluster the follow list of packages are avalible: * autoconf * automake * gcc * ctags * cg-vg * fftw2 * git * gfortran * gnuplot * libtool * netperf * octave3.2 * qhull * subversion * valgrind * gsl-dev * vim * python * mc * openmpi-bin * openmpi-dev * evince * libboost-graph-dev * libboost-serialization-dev * r-cran-strucchange * graphviz * doxygen * colorgcc [[HCL_cluster/heterogeneous.ucd.ie_install_log|new hcl node install & configuration log]] [[/new heterogeneous.ucd.ie install log]] ===APT=== To do unattended updates on cluster machines you need to specify some environment variables and switches to apt-get: export DEBIAN_FRONTEND=noninteractive apt-get -q -y upgrade NOTE: on hcl01 and hcl02 any updates to grub will force a prompt, despite the switches above. This happens because there are two disks on these machines and grub asks which it should install itself on. == Access and Security == All access and security for the cluster is handled by the gateway machine (heterogeneous.ucd.ie). 
This machine is not considered a compute node and should not be used as such. The only new incoming connections allowed are ssh; other incoming packets, such as http packets responding to requests from inside the cluster (established or related), are also allowed. Incoming ssh packets are only accepted if they originate from designated IP addresses. These must be registered UCD IPs. csserver.ucd.ie is allowed, as is hclgate.ucd.ie, on which all users have accounts. Thus, to gain access to the cluster, you can ssh from csserver, hclgate or another allowed machine to heterogeneous. From there you can ssh to any of the nodes (hcl01-hcl16) on which you are running a PBS job. Access from outside the UCD network is only possible once you have gained entry to a server that allows outside connections (such as csserver.ucd.ie).

=== Creating new user accounts ===
 adduser <username>
 make -C /var/yp

=== Access to the nodes is controlled by Torque PBS ===
Use qsub to submit a job; -I requests an interactive session, walltime is the time required.
 qsub -I -l walltime=1:00                          # reserve 1 node for 1 hour
 qsub -l nodes=hcl01+hcl07,walltime=1:00 myscript.sh
Example script:
 #!/bin/sh
 # General Script
 #
 # These commands set up the Grid Environment for your job:
 #PBS -N JOBNAME
 #PBS -l walltime=48:00:00
 #PBS -l nodes=16
 #PBS -m abe
 #PBS -k eo
 #PBS -V
 echo foo
To see the queue:
 qstat -n
 showq
To remove your job:
 qdel JOBNUM
More info: [http://www.clusterresources.com/products/torque/docs/]

== Some networking issues on HCL cluster (unsolved) ==
"/sbin/route" should give:
 Kernel IP routing table
 Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
 239.2.11.72     *               255.255.255.255 UH    0      0        0 eth0
 heterogeneous.u *               255.255.255.255 UH    0      0        0 eth0
 192.168.21.0    *               255.255.255.0   U     0      0        0 eth1
 192.168.20.0    *               255.255.255.0   U     0      0        0 eth0
 192.168.20.0    *               255.255.254.0   U     0      0        0 eth0
 192.168.20.0    *               255.255.254.0   U     0      0        0 eth1
 default         heterogeneous.u 0.0.0.0         UG    0      0        0 eth0
For reasons unclear, sometimes many machines miss the entry:
 192.168.21.0    *               255.255.255.0   U     0      0        0 eth1
For Open MPI, this leads to an inability to make a system sockets "connect" call to any 192.*.21.* address (it hangs). In this case, you can either
* switch off eth1 (see also [http://hcl.ucd.ie/wiki/index.php/OpenMPI]):
 mpirun --mca btl_tcp_if_exclude lo,eth1 ...
or
* restore the above table on all nodes by running "sh /etc/network/if-up.d/00routes" as root.
It is not yet clear why, without this entry, connections to the "21" addresses fail. We would expect the following rule to be matched in this case (because of the mask):
 192.168.20.0    *               255.255.254.0   U     0      0        0 eth0
The packets would then leave over the eth0 network interface and should travel over switch1 to switch2 and on to the eth1 interface of the corresponding node.
* If one attempts a ping from one node A, via its eth0 interface, to the address of another node B's eth1 interface, the following is observed:
** outgoing ping packets appear only on the eth0 interface of node A;
** incoming ping packets appear only on the eth1 interface of node B;
** outgoing ping response packets appear on the eth0 interface of node B, never on the eth1 interface, despite the eth1 address being pinged specifically.
What explains this? With the routing tables as they are above, or in the damaged case, the ping may arrive at the correct interface, but the response from B is routed to A-eth0 via B-eth0. Further, after a number of ping packets have been sent in sequence (50 to 100), pings from A, though the -I eth0 switch is specified, begin to appear on both A-eth0 and A-eth1. This behaviour is unexpected, but does not affect the return path of the ping response packet. In order to get symmetric behaviour, where a packet leaves A-eth0, travels via the switch bridge to B-eth1 and returns from B-eth1 to A-eth0, one must ensure that the routing table of B contains no eth0 entries.
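The mask-matching argument above can be checked mechanically. The following sketch (plain POSIX shell arithmetic; addresses taken from the routing tables above) shows that a "21" address is matched by both the /24 eth1 entry and the broader /23 eth0 entry, and that the /24 route wins by longest prefix whenever it is present:

```shell
# Convert a dotted-quad address to a 32-bit integer for mask arithmetic.
ip_to_int() {
    IFS=. read a b c d <<EOF
$1
EOF
    echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

dst=$(ip_to_int 192.168.21.5)        # example eth1 address of some node

# The /24 entry that sometimes goes missing: 192.168.21.0 via eth1.
if [ $(( dst & 0xFFFFFF00 )) -eq "$(ip_to_int 192.168.21.0)" ]; then
    echo "matches 192.168.21.0/24 -> eth1 (preferred: longest prefix)"
fi

# The broader /23 entry: 192.168.20.0, mask 255.255.254.0, via eth0.
if [ $(( dst & 0xFFFFFE00 )) -eq "$(ip_to_int 192.168.20.0)" ]; then
    echo "matches 192.168.20.0/23 -> eth0 (used only if the /24 is absent)"
fi
```

Both tests succeed for 192.168.21.5, which is why the /23 eth0 entry is expected to carry the traffic when the /24 entry disappears.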
== Paging and the OOM-Killer ==
Due to the nature of experiments run on the cluster, we often induce heavy paging and complete exhaustion of available memory on certain nodes. Linux has a pair of strategies to deal with heavy memory use. The first is overcommitting: a process is allowed to allocate or fork even when there is no more memory available. You can see some interesting numbers here: [http://opsmonkey.blogspot.com/2007/01/linux-memory-overcommit.html]. The assumption is that processes may not use all the memory that they allocate, and that failing on allocation is worse than failing later, when the memory is actually used. More processes can be supported by allowing them to allocate memory (provided they do not use it all). The second part of the strategy is the Out-of-Memory killer (OOM killer). When memory has been exhausted and a process tries to use some 'overcommitted' part of memory, the OOM killer is invoked. Its job is to rank all processes in terms of their memory use, priority, privilege and some other parameters, and then select a process to kill based on the ranks. The argument for using overcommit plus the OOM killer is that, rather than failing to allocate memory for some random unlucky process, which would as a result probably terminate, the kernel can instead allow the unlucky process to continue executing and then make a somewhat informed decision on which process to kill. Unfortunately, the behaviour of the OOM killer sometimes causes problems which grind the machine to a complete halt, particularly when it decides to kill system processes. There is a good discussion of the OOM killer here: [http://lwn.net/Articles/104179/] For this reason overcommit has been disabled on the cluster.
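A quick way to check which policy a node is actually running is to read the sysctl and interpret it; a minimal sketch (standard Linux procfs paths, output depends on the machine it runs on):

```shell
# Interpret /proc/sys/vm/overcommit_memory as described in the section above.
mode=$(cat /proc/sys/vm/overcommit_memory 2>/dev/null || echo "?")
case "$mode" in
    0) echo "0: heuristic overcommit (kernel default)" ;;
    1) echo "1: always overcommit, never refuse an allocation" ;;
    2) echo "2: strict accounting -- the cluster setting, no overcommit" ;;
    *) echo "could not read overcommit_memory" ;;
esac
```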
 cat /proc/sys/vm/overcommit_memory
 2
 cat /proc/sys/vm/overcommit_ratio
 100
To restore the default overcommit behaviour:
 # echo 0 > /proc/sys/vm/overcommit_memory
 # echo 50 > /proc/sys/vm/overcommit_ratio

== Manually Limit the Memory on the OS level ==
As root, edit /etc/default/grub:
 GRUB_CMDLINE_LINUX_DEFAULT="quiet mem=128M"
then run:
 update-grub
 reboot
SSH

== Passwordless SSH ==
To set up passwordless SSH, there are three main things to do:
* generate a pair of public/private keys on your local computer;
* copy the public key from the source computer to the target computer's authorized_keys file;
* check the permissions.
You can repeat this transitively for "A->B->C". You can use the initial pair of keys everywhere.
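The three steps can be rehearsed locally before touching the cluster. This sketch generates a throwaway key pair into a scratch directory, stages the public key as an authorized_keys file (on the real target you would use <code>ssh-copy-id user@host</code> or append over ssh), and sets the permissions sshd insists on:

```shell
# Local dry run of the passwordless-SSH setup steps.
tmp=$(mktemp -d)

# Step 1: generate a key pair with an empty passphrase.
ssh-keygen -q -t rsa -N "" -f "$tmp/id_rsa"

# Step 2: append the public key to authorized_keys (done on the target
# host in real use, e.g. with `ssh-copy-id user@target`).
cat "$tmp/id_rsa.pub" >> "$tmp/authorized_keys"

# Step 3: sshd refuses keys when permissions are too open.
chmod 700 "$tmp"
chmod 600 "$tmp/id_rsa" "$tmp/authorized_keys"
```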
See here for details: http://www.stearns.org/doc/ssh-techniques.current.html

== Automatically saying "yes" ==
This expect script automates typing "yes" when SSH asks whether a host should be added to known_hosts:
 #!/usr/bin/expect -f
 set arg1 [lindex $argv 0]
 set timeout 2
 spawn ssh $arg1
 expect "yes/no" { send "yes\n" }
 send "exit\n"
 send "\r"
You can call it from a bash script to iterate over all nodes:
 for i in `uniq hostfile` ; do
     ./say-yes.exp $i
 done

== Making a cascade of SSH connections easy ==
Here is a very convenient way to set up direct access to any machine instead of doing a cascade of SSH calls. If you cannot directly access e.g. the machine "heterogeneous", but you can log into "csserver" and then into "heterogeneous", you can put this into your .ssh/config file:
 Host csserver
     User kdichev
     Hostname csserver.ucd.ie
 
 Host heterogeneous
     User kiril
     Hostname heterogeneous.ucd.ie
     ProxyCommand ssh -qax csserver nc %h %p
Since the installation of the new PBS system, you cannot directly log into a hclXX node. Instead, do ssh heterogeneous and use "qsub": [[HCL_cluster#Access_and_Security]]

== X11 forwarding ==
<code lang="bash">ssh -X hostname</code>
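The cascade configuration shown earlier can be sanity-checked without connecting anywhere: reasonably recent OpenSSH clients accept <code>-G</code>, which prints the options ssh would use for a given host. A sketch, writing the fragment to a scratch file first:

```shell
# Resolve the cascade configuration with `ssh -G` (no connection is made).
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
Host csserver
    User kdichev
    Hostname csserver.ucd.ie

Host heterogeneous
    User kiril
    Hostname heterogeneous.ucd.ie
    ProxyCommand ssh -qax csserver nc %h %p
EOF
# Show the user, hostname and proxy that `ssh heterogeneous` would use.
ssh -G -F "$cfg" heterogeneous | grep -E '^(user|hostname|proxycommand) '
```

If the output shows user kiril, the full hostname, and the nc ProxyCommand, the cascade is wired up correctly.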
Main Page

This site is set up for sharing ideas, findings and experience in heterogeneous computing. Please log in and create new pages or edit existing ones. To learn how to format wiki pages, read [[Help:Editing|here]].
== HCL software for heterogeneous computing ==
* Extensions for [[MPI]]: [http://hcl.ucd.ie/project/mpC mpC], [http://hcl.ucd.ie/project/HeteroMPI HeteroMPI], [http://hcl.ucd.ie/project/libELC libELC]
* Extensions for [[GridRPC]]: [http://hcl.ucd.ie/project/SmartGridSolve SmartGridSolve], [http://hcl.ucd.ie/project/NI-Connect NI-Connect]
* Computation benchmarking, modeling, dynamic load balancing: [http://hcl.ucd.ie/project/fupermod FuPerMod], [http://hcl.ucd.ie/project/pmm PMM]
* Communication benchmarking, modeling, optimization: [http://hcl.ucd.ie/project/cpm CPM], [http://hcl.ucd.ie/project/mpiblib MPIBlib]

== Heterogeneous mathematical software ==
* [http://hcl.ucd.ie/project/HeteroScaLAPACK HeteroScaLAPACK]
* [http://hcl.ucd.ie/project/Hydropad Hydropad]

== Operating systems ==
* [[Linux]]
* [[Windows]]

== Development tools ==
* [[C/C++]], [[Python]]
* [[Autotools]]
* [[GDB]], [[OProfile]], [[Valgrind]]
* [[Doxygen]]
* [[ChangeLog]], [[Subversion]]
* [[Eclipse]]

== [[Libraries]] ==
* [[GNU C Library]]
* [[MPI]]
* [[STL]], [[Boost]]
* [[GSL]]
* [[BLAS/LAPACK]]

== Data processing ==
* [[gnuplot]]
* [[Graphviz]]
* [[Octave]], [[R]]
* [[G3DViewer]]

== Presentation ==
* [[Dia]]
* [[LaTeX]]

== Clusters ==
* [[HCL cluster]]
* [[Other UCD Resources]]
* [[UTK clusters]]
* [[Grid5000]]

[[SSH|How to connect to cluster via SSH]]

== Mathematics ==
* [http://en.wikipedia.org/wiki/Confidence_interval Confidence interval (Statistics)], [http://en.wikipedia.org/wiki/Student's_t-distribution Student's t-distribution] (implemented in [[GSL]])
* [http://en.wikipedia.org/wiki/Linear_regression Linear regression] (implemented in [[GSL]])
* [http://en.wikipedia.org/wiki/Binomial_tree#Binomial_tree Binomial tree] (use [[Graphviz]] to visualize trees)
* [http://en.wikipedia.org/wiki/Spline_interpolation Spline interpolation], [http://en.wikipedia.org/wiki/B-spline Spline approximation] (implemented in [[GSL]])

== Tips & Tricks ==
* [http://www.gnu.org/s/libc/manual/html_node/Date-and-Time.html#Date-and-Time Timing in C]

G3DViewer

Simple 3D viewer, supports VRML, 3D Studio, AutoCAD and other formats
* http://automagically.de/g3dviewer/
* Debian package g3dviewer

HCL Cluster Specifications

==Cluster Specifications==
{| border="1" cellspacing="1" cellpadding="5"
| Name
| Make/Model
| IP
| Processor
| Front Side Bus
| L2 Cache
| RAM
| HDD 1
| HDD 2
| NIC
| Rack
|-
| ALIGN=LEFT | hclswitch1
| ALIGN=LEFT | Cisco Catalyst 3560G
| ALIGN=LEFT | 192.168.21.252
| ALIGN=LEFT | N/A
| ALIGN=LEFT | N/A
| ALIGN=LEFT | N/A
| ALIGN=LEFT | N/A
| ALIGN=LEFT | N/A
| ALIGN=LEFT | N/A
| ALIGN=LEFT | 24 x Gigabit
| 42
|-
| ALIGN=LEFT | hclswitch2
| ALIGN=LEFT | Cisco Catalyst 3560G
| ALIGN=LEFT | 192.168.21.253
| ALIGN=LEFT | N/A
| ALIGN=LEFT | N/A
| ALIGN=LEFT | N/A
| ALIGN=LEFT | N/A
| ALIGN=LEFT | N/A
| ALIGN=LEFT | N/A
| ALIGN=LEFT | 24 x Gigabit
| 41
|-
| ALIGN=LEFT | N/A
| ALIGN=LEFT | APC Smart UPS 1500
| ALIGN=LEFT | N/A
| ALIGN=LEFT | N/A
| ALIGN=LEFT | N/A
| ALIGN=LEFT | N/A
| ALIGN=LEFT | N/A
| ALIGN=LEFT | N/A
| ALIGN=LEFT | N/A
| ALIGN=LEFT | N/A
| 1 – 2
|-
| ALIGN=LEFT | [[Hcl01]] (NIC1)
| ALIGN=LEFT | Dell Poweredge 750
| ALIGN=LEFT | 192.168.21.3
| ALIGN=LEFT | 3.4 Xeon
| ALIGN=LEFT | 800MHz
| ALIGN=LEFT | 1MB
| ALIGN=LEFT | 1GB
| ALIGN=LEFT | 80GB SATA
| ALIGN=LEFT | 250GB SATA
| ALIGN=LEFT | 2 x Gigabit
| 3
|-
| ALIGN=LEFT | Hcl01 (NIC2)
| ALIGN=LEFT | <BR>
| ALIGN=LEFT | 192.168.21.103
| ALIGN=LEFT | <BR>
| ALIGN=LEFT | <BR>
| ALIGN=LEFT | <BR>
| ALIGN=LEFT | <BR>
| ALIGN=LEFT | <BR>
| ALIGN=LEFT | <BR>
| ALIGN=LEFT | <BR>
| 3
|- | ALIGN=LEFT | [[Hcl02]] (NIC1) | ALIGN=LEFT | Dell Poweredge 750 | ALIGN=LEFT | 192.168.21.4 | ALIGN=LEFT | 3.4 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB (4x256) | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | 250GB SATA | ALIGN=LEFT | 2 x Gigabit | 4 |- | ALIGN=LEFT | Hcl02 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.104 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 4 |- | ALIGN=LEFT | [[Hcl03]] (NIC1) | ALIGN=LEFT | Dell Poweredge 750 | ALIGN=LEFT | 192.168.21.5 | ALIGN=LEFT | 3.4 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB (2x512) | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit | 5 |- | ALIGN=LEFT | Hcl03 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.105 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 5 |- | ALIGN=LEFT | [[Hcl04]] (NIC1) | ALIGN=LEFT | Dell Poweredge 750 | ALIGN=LEFT | 192.168.21.6 | ALIGN=LEFT | 3.4 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit | 6 |- | ALIGN=LEFT | Hcl04 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.106 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 6 |- | ALIGN=LEFT | [[Hcl05]] (NIC1) | ALIGN=LEFT | Dell Poweredge SC1425 | ALIGN=LEFT | 192.168.21.7 | ALIGN=LEFT | 3.6 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 2MB | ALIGN=LEFT | 256MB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit | 7 |- | ALIGN=LEFT | Hcl05 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.107 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 7 |- | ALIGN=LEFT | [[Hcl06]] (NIC1) | ALIGN=LEFT 
| Dell Poweredge SC1425 | ALIGN=LEFT | 192.168.21.8 | ALIGN=LEFT | 3.0 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 2MB | ALIGN=LEFT | 256MB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit | 8 |- | ALIGN=LEFT | Hcl06 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.108 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 8 |- | ALIGN=LEFT | [[Hcl07]](NIC1) | ALIGN=LEFT | Dell Poweredge 750 | ALIGN=LEFT | 192.168.21.9 | ALIGN=LEFT | 3.4 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 256MB (1x256) | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit | 9 |- | ALIGN=LEFT | Hcl07 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.109 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 9 |- | ALIGN=LEFT | [[Hcl08]] (NIC1) | ALIGN=LEFT | Dell Poweredge 750 | ALIGN=LEFT | 192.168.21.10 | ALIGN=LEFT | 3.4 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 256MB (1x256) | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit | 10 |- | ALIGN=LEFT | Hcl08 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.110 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 10 |- | ALIGN=LEFT | [[Hcl09]] (NIC1) | ALIGN=LEFT | IBM E-server 326 | ALIGN=LEFT | 192.168.21.11 | ALIGN=LEFT | 1.8 AMD Opteron | ALIGN=LEFT | 1GHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit | 11 |- | ALIGN=LEFT | Hcl09 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.111 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 11 |- | ALIGN=LEFT | [[Hcl10]] (NIC1) | ALIGN=LEFT | IBM E-server 326 | ALIGN=LEFT | 
192.168.21.12 | ALIGN=LEFT | 1.8 AMD Opteron | ALIGN=LEFT | 1GHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit | 12 |- | ALIGN=LEFT | Hcl10 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.112 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 12 |- | ALIGN=LEFT | [[Hcl11]] (NIC1) | ALIGN=LEFT | IBM X-Series 306 | ALIGN=LEFT | 192.168.21.13 | ALIGN=LEFT | 3.2 P4 | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 512MB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit | 13 |- | ALIGN=LEFT | Hcl11 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.113 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 13 |- | ALIGN=LEFT | [[Hcl12]] (NIC1) | ALIGN=LEFT | HP Proliant DL 320 G3 | ALIGN=LEFT | 192.168.21.14 | ALIGN=LEFT | 3.4 P4 | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 512MB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit | 14 |- | ALIGN=LEFT | Hcl12 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.114 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 14 |- | ALIGN=LEFT | [[Hcl13]] (NIC1) | ALIGN=LEFT | HP Proliant DL 320 G3 | ALIGN=LEFT | 192.168.21.15 | ALIGN=LEFT | 2.9 Celeron | ALIGN=LEFT | 533MHz | ALIGN=LEFT | 256KB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit | 15 |- | ALIGN=LEFT | Hcl13 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.115 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 15 |- | ALIGN=LEFT | [[Hcl14]] (NIC1) | ALIGN=LEFT | HP Proliant DL 140 G2 | ALIGN=LEFT | 192.168.21.16 | ALIGN=LEFT | 3.4 Xeon | 
ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit | 16 |- | ALIGN=LEFT | Hcl14 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.116 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 16 |- | ALIGN=LEFT | [[Hcl15]] (NIC1) | ALIGN=LEFT | HP Proliant DL 140 G2 | ALIGN=LEFT | 192.168.21.17 | ALIGN=LEFT | 2.8 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit | 17 |- | ALIGN=LEFT | Hcl15 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.117 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 17 |- | ALIGN=LEFT | [[Hcl16]] (NIC1) | ALIGN=LEFT | HP Proliant DL 140 G2 | ALIGN=LEFT | 192.168.21.18 | ALIGN=LEFT | 3.6 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 2MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit | 18 |- | ALIGN=LEFT | Hcl16 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.118 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 18 |} ==Cluster Benchmarks== ===Stream=== ====Cluster Performance==== ---------------------------------------------- Double precision appears to have 16 digits of accuracy Assuming 8 bytes per DOUBLE PRECISION word ---------------------------------------------- Number of processors = 18 Array size = 2000000 Offset = 0 The total memory requirement is 824.0 MB ( 45.8MB/task) You are running each test 10 times -- The *best* time for each test is used *EXCLUDING* the first and last iterations ---------------------------------------------------- Your clock granularity appears to be less than one microsecond Your clock granularity/precision appears to be 1 
microseconds ---------------------------------------------------- Function Rate (MB/s) Avg time Min time Max time Copy: 24589.1750 0.0235 0.0234 0.0237 Scale: 24493.9786 0.0237 0.0235 0.0245 Add: 27594.1797 0.0314 0.0313 0.0315 Triad: 27695.7938 0.0313 0.0312 0.0315 ----------------------------------------------- Solution Validates! ----------------------------------------------- ====Individual Node Performance==== [[Image:stream_bench_results.png]] hcl01.heterogeneous.ucd.ie ------------------------------------------------------------- STREAM version $Revision: 5.9 $ ------------------------------------------------------------- This system uses 8 bytes per DOUBLE PRECISION word. ------------------------------------------------------------- Array size = 2000000, Offset = 0 Total memory required = 45.8 MB. Each test is run 10 times, but only the *best* time for each is used. ------------------------------------------------------------- Number of Threads requested = 1 ------------------------------------------------------------- Printing one line per active thread.... ------------------------------------------------------------- Your clock granularity appears to be less than one microsecond. Each test below will take on the order of 8386 microseconds. (= 8386 clock ticks) Increase the size of the arrays if this shows that you are not getting at least 20 clock ticks per test. ------------------------------------------------------------- WARNING -- The above is only a rough guideline. For best results, please be sure you know the precision of your system timer. 
------------------------------------------------------------- Function Rate (MB/s) Avg time Min time Max time Copy: 2615.4617 0.0125 0.0122 0.0132 Scale: 2609.2783 0.0125 0.0123 0.0133 Add: 3046.4707 0.0161 0.0158 0.0168 Triad: 3064.7322 0.0160 0.0157 0.0166 ------------------------------------------------------------- Solution Validates ------------------------------------------------------------- hcl02.heterogeneous.ucd.ie ------------------------------------------------------------- STREAM version $Revision: 5.9 $ ------------------------------------------------------------- This system uses 8 bytes per DOUBLE PRECISION word. ------------------------------------------------------------- Array size = 2000000, Offset = 0 Total memory required = 45.8 MB. Each test is run 10 times, but only the *best* time for each is used. ------------------------------------------------------------- Number of Threads requested = 1 ------------------------------------------------------------- Printing one line per active thread.... ------------------------------------------------------------- Your clock granularity appears to be less than one microsecond. Each test below will take on the order of 7790 microseconds. (= 7790 clock ticks) Increase the size of the arrays if this shows that you are not getting at least 20 clock ticks per test. ------------------------------------------------------------- WARNING -- The above is only a rough guideline. For best results, please be sure you know the precision of your system timer. 
------------------------------------------------------------- Function Rate (MB/s) Avg time Min time Max time Copy: 2740.6840 0.0117 0.0117 0.0120 Scale: 2745.3930 0.0117 0.0117 0.0117 Add: 3063.1599 0.0157 0.0157 0.0157 Triad: 3075.3572 0.0156 0.0156 0.0159 ------------------------------------------------------------- Solution Validates ------------------------------------------------------------- hcl03.heterogeneous.ucd.ie ------------------------------------------------------------- STREAM version $Revision: 5.9 $ ------------------------------------------------------------- This system uses 8 bytes per DOUBLE PRECISION word. ------------------------------------------------------------- Array size = 2000000, Offset = 0 Total memory required = 45.8 MB. Each test is run 10 times, but only the *best* time for each is used. ------------------------------------------------------------- Number of Threads requested = 1 ------------------------------------------------------------- Printing one line per active thread.... ------------------------------------------------------------- Your clock granularity appears to be less than one microsecond. Each test below will take on the order of 8382 microseconds. (= 8382 clock ticks) Increase the size of the arrays if this shows that you are not getting at least 20 clock ticks per test. ------------------------------------------------------------- WARNING -- The above is only a rough guideline. For best results, please be sure you know the precision of your system timer. 
------------------------------------------------------------- Function Rate (MB/s) Avg time Min time Max time Copy: 2788.6728 0.0115 0.0115 0.0115 Scale: 2722.0144 0.0118 0.0118 0.0121 Add: 3266.6166 0.0148 0.0147 0.0150 Triad: 3287.4249 0.0146 0.0146 0.0147 ------------------------------------------------------------- Solution Validates ------------------------------------------------------------- hcl04.heterogeneous.ucd.ie ------------------------------------------------------------- STREAM version $Revision: 5.9 $ ------------------------------------------------------------- This system uses 8 bytes per DOUBLE PRECISION word. ------------------------------------------------------------- Array size = 2000000, Offset = 0 Total memory required = 45.8 MB. Each test is run 10 times, but only the *best* time for each is used. ------------------------------------------------------------- Number of Threads requested = 1 ------------------------------------------------------------- Printing one line per active thread.... ------------------------------------------------------------- Your clock granularity appears to be less than one microsecond. Each test below will take on the order of 8378 microseconds. (= 8378 clock ticks) Increase the size of the arrays if this shows that you are not getting at least 20 clock ticks per test. ------------------------------------------------------------- WARNING -- The above is only a rough guideline. For best results, please be sure you know the precision of your system timer. 
------------------------------------------------------------- Function Rate (MB/s) Avg time Min time Max time Copy: 2815.4071 0.0114 0.0114 0.0114 Scale: 2751.5270 0.0116 0.0116 0.0117 Add: 3260.1979 0.0148 0.0147 0.0150 Triad: 3274.2183 0.0147 0.0147 0.0150 ------------------------------------------------------------- Solution Validates ------------------------------------------------------------- hcl05.heterogeneous.ucd.ie ------------------------------------------------------------- STREAM version $Revision: 5.9 $ ------------------------------------------------------------- This system uses 8 bytes per DOUBLE PRECISION word. ------------------------------------------------------------- Array size = 2000000, Offset = 0 Total memory required = 45.8 MB. Each test is run 10 times, but only the *best* time for each is used. ------------------------------------------------------------- Number of Threads requested = 1 ------------------------------------------------------------- Printing one line per active thread.... ------------------------------------------------------------- Your clock granularity appears to be less than one microsecond. Each test below will take on the order of 14916 microseconds. (= 14916 clock ticks) Increase the size of the arrays if this shows that you are not getting at least 20 clock ticks per test. ------------------------------------------------------------- WARNING -- The above is only a rough guideline. For best results, please be sure you know the precision of your system timer. 
------------------------------------------------------------- Function Rate (MB/s) Avg time Min time Max time Copy: 1581.8068 0.0203 0.0202 0.0204 Scale: 1557.1081 0.0207 0.0206 0.0215 Add: 1807.7015 0.0266 0.0266 0.0266 Triad: 1832.9646 0.0263 0.0262 0.0265 ------------------------------------------------------------- Solution Validates ------------------------------------------------------------- hcl06.heterogeneous.ucd.ie ------------------------------------------------------------- STREAM version $Revision: 5.9 $ ------------------------------------------------------------- This system uses 8 bytes per DOUBLE PRECISION word. ------------------------------------------------------------- Array size = 2000000, Offset = 0 Total memory required = 45.8 MB. Each test is run 10 times, but only the *best* time for each is used. ------------------------------------------------------------- Number of Threads requested = 1 ------------------------------------------------------------- Printing one line per active thread.... ------------------------------------------------------------- Your clock granularity appears to be less than one microsecond. Each test below will take on the order of 14824 microseconds. (= 14824 clock ticks) Increase the size of the arrays if this shows that you are not getting at least 20 clock ticks per test. ------------------------------------------------------------- WARNING -- The above is only a rough guideline. For best results, please be sure you know the precision of your system timer. 
------------------------------------------------------------- Function Rate (MB/s) Avg time Min time Max time Copy: 1558.3835 0.0206 0.0205 0.0206 Scale: 1550.3951 0.0207 0.0206 0.0209 Add: 1863.4239 0.0258 0.0258 0.0259 Triad: 1885.3916 0.0255 0.0255 0.0259 ------------------------------------------------------------- Solution Validates ------------------------------------------------------------- hcl07.heterogeneous.ucd.ie ------------------------------------------------------------- STREAM version $Revision: 5.9 $ ------------------------------------------------------------- This system uses 8 bytes per DOUBLE PRECISION word. ------------------------------------------------------------- Array size = 2000000, Offset = 0 Total memory required = 45.8 MB. Each test is run 10 times, but only the *best* time for each is used. ------------------------------------------------------------- Number of Threads requested = 1 ------------------------------------------------------------- Printing one line per active thread.... ------------------------------------------------------------- Your clock granularity appears to be less than one microsecond. Each test below will take on the order of 13642 microseconds. (= 13642 clock ticks) Increase the size of the arrays if this shows that you are not getting at least 20 clock ticks per test. ------------------------------------------------------------- WARNING -- The above is only a rough guideline. For best results, please be sure you know the precision of your system timer. 
------------------------------------------------------------- Function Rate (MB/s) Avg time Min time Max time Copy: 1759.4014 0.0183 0.0182 0.0186 Scale: 1740.2731 0.0184 0.0184 0.0185 Add: 2036.4962 0.0236 0.0236 0.0238 Triad: 2045.5920 0.0235 0.0235 0.0235 ------------------------------------------------------------- Solution Validates ------------------------------------------------------------- hcl08.heterogeneous.ucd.ie ------------------------------------------------------------- STREAM version $Revision: 5.9 $ ------------------------------------------------------------- This system uses 8 bytes per DOUBLE PRECISION word. ------------------------------------------------------------- Array size = 2000000, Offset = 0 Total memory required = 45.8 MB. Each test is run 10 times, but only the *best* time for each is used. ------------------------------------------------------------- Number of Threads requested = 1 ------------------------------------------------------------- Printing one line per active thread.... ------------------------------------------------------------- Your clock granularity appears to be less than one microsecond. Each test below will take on the order of 13476 microseconds. (= 13476 clock ticks) Increase the size of the arrays if this shows that you are not getting at least 20 clock ticks per test. ------------------------------------------------------------- WARNING -- The above is only a rough guideline. For best results, please be sure you know the precision of your system timer. 
------------------------------------------------------------- Function Rate (MB/s) Avg time Min time Max time Copy: 1809.6454 0.0177 0.0177 0.0177 Scale: 1784.8271 0.0179 0.0179 0.0179 Add: 2085.3320 0.0231 0.0230 0.0232 Triad: 2095.7094 0.0231 0.0229 0.0239 ------------------------------------------------------------- Solution Validates ------------------------------------------------------------- hcl09.heterogeneous.ucd.ie ------------------------------------------------------------- STREAM version $Revision: 5.9 $ ------------------------------------------------------------- This system uses 8 bytes per DOUBLE PRECISION word. ------------------------------------------------------------- Array size = 2000000, Offset = 0 Total memory required = 45.8 MB. Each test is run 10 times, but only the *best* time for each is used. ------------------------------------------------------------- Number of Threads requested = 2 ------------------------------------------------------------- Printing one line per active thread.... Printing one line per active thread.... ------------------------------------------------------------- Your clock granularity/precision appears to be 2 microseconds. Each test below will take on the order of 7890 microseconds. (= 3945 clock ticks) Increase the size of the arrays if this shows that you are not getting at least 20 clock ticks per test. ------------------------------------------------------------- WARNING -- The above is only a rough guideline. For best results, please be sure you know the precision of your system timer. 
------------------------------------------------------------- Function Rate (MB/s) Avg time Min time Max time Copy: 2838.6204 0.0114 0.0113 0.0115 Scale: 2774.6396 0.0116 0.0115 0.0117 Add: 3144.0396 0.0155 0.0153 0.0156 Triad: 3160.7905 0.0153 0.0152 0.0156 ------------------------------------------------------------- Solution Validates ------------------------------------------------------------- hcl10.heterogeneous.ucd.ie ------------------------------------------------------------- STREAM version $Revision: 5.9 $ ------------------------------------------------------------- This system uses 8 bytes per DOUBLE PRECISION word. ------------------------------------------------------------- Array size = 2000000, Offset = 0 Total memory required = 45.8 MB. Each test is run 10 times, but only the *best* time for each is used. ------------------------------------------------------------- Number of Threads requested = 2 ------------------------------------------------------------- Printing one line per active thread.... Printing one line per active thread.... ------------------------------------------------------------- Your clock granularity/precision appears to be 2 microseconds. Each test below will take on the order of 7959 microseconds. (= 3979 clock ticks) Increase the size of the arrays if this shows that you are not getting at least 20 clock ticks per test. ------------------------------------------------------------- WARNING -- The above is only a rough guideline. For best results, please be sure you know the precision of your system timer. 
------------------------------------------------------------- Function Rate (MB/s) Avg time Min time Max time Copy: 2864.2732 0.0113 0.0112 0.0114 Scale: 2784.2952 0.0116 0.0115 0.0118 Add: 3162.2553 0.0153 0.0152 0.0155 Triad: 3223.8430 0.0151 0.0149 0.0153 ------------------------------------------------------------- Solution Validates ------------------------------------------------------------- hcl11.heterogeneous.ucd.ie ------------------------------------------------------------- STREAM version $Revision: 5.9 $ ------------------------------------------------------------- This system uses 8 bytes per DOUBLE PRECISION word. ------------------------------------------------------------- Array size = 2000000, Offset = 0 Total memory required = 45.8 MB. Each test is run 10 times, but only the *best* time for each is used. ------------------------------------------------------------- Number of Threads requested = 1 ------------------------------------------------------------- Printing one line per active thread.... ------------------------------------------------------------- Your clock granularity/precision appears to be 1 microseconds. Each test below will take on the order of 13082 microseconds. (= 13082 clock ticks) Increase the size of the arrays if this shows that you are not getting at least 20 clock ticks per test. ------------------------------------------------------------- WARNING -- The above is only a rough guideline. For best results, please be sure you know the precision of your system timer. 
------------------------------------------------------------- Function Rate (MB/s) Avg time Min time Max time Copy: 1842.8984 0.0174 0.0174 0.0174 Scale: 1818.1816 0.0177 0.0176 0.0179 Add: 2109.8918 0.0228 0.0227 0.0231 Triad: 2119.0161 0.0227 0.0227 0.0227 ------------------------------------------------------------- Solution Validates ------------------------------------------------------------- hcl12.heterogeneous.ucd.ie ------------------------------------------------------------- STREAM version $Revision: 5.9 $ ------------------------------------------------------------- This system uses 8 bytes per DOUBLE PRECISION word. ------------------------------------------------------------- Array size = 2000000, Offset = 0 Total memory required = 45.8 MB. Each test is run 10 times, but only the *best* time for each is used. ------------------------------------------------------------- Number of Threads requested = 1 ------------------------------------------------------------- Printing one line per active thread.... ------------------------------------------------------------- Your clock granularity appears to be less than one microsecond. Each test below will take on the order of 12621 microseconds. (= 12621 clock ticks) Increase the size of the arrays if this shows that you are not getting at least 20 clock ticks per test. ------------------------------------------------------------- WARNING -- The above is only a rough guideline. For best results, please be sure you know the precision of your system timer. 
------------------------------------------------------------- Function Rate (MB/s) Avg time Min time Max time Copy: 1805.0657 0.0178 0.0177 0.0179 Scale: 1795.5385 0.0179 0.0178 0.0182 Add: 2089.9636 0.0230 0.0230 0.0230 Triad: 2097.5313 0.0229 0.0229 0.0231 ------------------------------------------------------------- Solution Validates ------------------------------------------------------------- hcl13.heterogeneous.ucd.ie ------------------------------------------------------------- STREAM version $Revision: 5.9 $ ------------------------------------------------------------- This system uses 8 bytes per DOUBLE PRECISION word. ------------------------------------------------------------- Array size = 2000000, Offset = 0 Total memory required = 45.8 MB. Each test is run 10 times, but only the *best* time for each is used. ------------------------------------------------------------- Number of Threads requested = 1 ------------------------------------------------------------- Printing one line per active thread.... ------------------------------------------------------------- Your clock granularity appears to be less than one microsecond. Each test below will take on the order of 12203 microseconds. (= 12203 clock ticks) Increase the size of the arrays if this shows that you are not getting at least 20 clock ticks per test. ------------------------------------------------------------- WARNING -- The above is only a rough guideline. For best results, please be sure you know the precision of your system timer. 
-------------------------------------------------------------
Function      Rate (MB/s)   Avg time     Min time     Max time
Copy:          1768.9387       0.0181       0.0181       0.0182
Scale:         1765.9100       0.0182       0.0181       0.0184
Add:           2000.5875       0.0241       0.0240       0.0244
Triad:         2001.0897       0.0240       0.0240       0.0241
-------------------------------------------------------------
Solution Validates
-------------------------------------------------------------
hcl14.heterogeneous.ucd.ie
-------------------------------------------------------------
STREAM version $Revision: 5.9 $
-------------------------------------------------------------
This system uses 8 bytes per DOUBLE PRECISION word.
-------------------------------------------------------------
Array size = 2000000, Offset = 0
Total memory required = 45.8 MB.
Each test is run 10 times, but only the *best* time for each is used.
-------------------------------------------------------------
Number of Threads requested = 1
-------------------------------------------------------------
Printing one line per active thread....
-------------------------------------------------------------
Your clock granularity appears to be less than one microsecond.
Each test below will take on the order of 9576 microseconds.
   (= 9576 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function      Rate (MB/s)   Avg time     Min time     Max time
Copy:          2254.3285       0.0142       0.0142       0.0142
Scale:         2291.6040       0.0140       0.0140       0.0140
Add:           2779.5354       0.0173       0.0173       0.0173
Triad:         2804.8951       0.0177       0.0171       0.0215
-------------------------------------------------------------
Solution Validates
-------------------------------------------------------------
hcl15.heterogeneous.ucd.ie
-------------------------------------------------------------
STREAM version $Revision: 5.9 $
-------------------------------------------------------------
This system uses 8 bytes per DOUBLE PRECISION word.
-------------------------------------------------------------
Array size = 2000000, Offset = 0
Total memory required = 45.8 MB.
Each test is run 10 times, but only the *best* time for each is used.
-------------------------------------------------------------
Number of Threads requested = 1
-------------------------------------------------------------
Printing one line per active thread....
-------------------------------------------------------------
Your clock granularity appears to be less than one microsecond.
Each test below will take on the order of 9788 microseconds.
   (= 9788 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function      Rate (MB/s)   Avg time     Min time     Max time
Copy:          2223.0043       0.0146       0.0144       0.0161
Scale:         2264.6703       0.0142       0.0141       0.0142
Add:           2740.5171       0.0176       0.0175       0.0178
Triad:         2762.2584       0.0174       0.0174       0.0174
-------------------------------------------------------------
Solution Validates
-------------------------------------------------------------
hcl16.heterogeneous.ucd.ie
-------------------------------------------------------------
STREAM version $Revision: 5.9 $
-------------------------------------------------------------
This system uses 8 bytes per DOUBLE PRECISION word.
-------------------------------------------------------------
Array size = 2000000, Offset = 0
Total memory required = 45.8 MB.
Each test is run 10 times, but only the *best* time for each is used.
-------------------------------------------------------------
Number of Threads requested = 1
-------------------------------------------------------------
Printing one line per active thread....
-------------------------------------------------------------
Your clock granularity appears to be less than one microsecond.
Each test below will take on the order of 9590 microseconds.
   (= 9590 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function      Rate (MB/s)   Avg time     Min time     Max time
Copy:          2285.3882       0.0140       0.0140       0.0143
Scale:         2294.5687       0.0140       0.0139       0.0140
Add:           2800.9568       0.0172       0.0171       0.0172
Triad:         2823.2048       0.0170       0.0170       0.0171
-------------------------------------------------------------
Solution Validates
-------------------------------------------------------------
a0cfa9d12b9f636f0bc0cc8dabf931b2467a6cb8 HCL cluster/heterogeneous.ucd.ie install log 0 48 492 371 2010-10-07T17:37:46Z Davepc 2 /* Install NIS */ wikitext text/x-wiki
* Basic installation of Debian Squeeze

==Networking==
===Interfaces===
* edit <code>/etc/network/interfaces</code>

Note that at some point eth1 should be configured by DHCP; it is on the UCD LAN and must be registered correctly (update MAC address with services). <code>eth0</code> is the internal network.
<source lang="text">
# The loopback network interface
auto lo eth0 eth1
iface lo inet loopback

# The primary network interface
allow-hotplug eth0
iface eth0 inet static
    address 192.168.21.254
    netmask 255.255.255.0
    gateway 192.168.21.1

iface eth1 inet static
    address 193.1.132.124
    netmask 255.255.252.0
    gateway 193.1.132.1
</source>
* Install the non-free Linux firmware for the network interface (eth0). This will allow Gigabit operation on eth0 with the tg3 hardware (I think). Edit <code>/etc/apt/sources.list</code> to include the lines:
<source lang="text">deb http://ftp.ie.debian.org/debian/ squeeze main contrib non-free
deb-src http://ftp.ie.debian.org/debian/ squeeze main contrib non-free</source>
* Install firmware-linux:
<source lang="text">apt-get update && apt-get install firmware-linux</source>
You probably need to reboot now.

===DNS / BIND===
We will run our own DNS server for the cluster.
First set resolv.conf:
<source lang="text">
nameserver 127.0.0.1
nameserver 137.43.116.19
nameserver 137.43.116.17
nameserver 137.43.105.22
domain ucd.ie
search ucd.ie
</source>
Now install bind9 (<code>apt-get install bind9</code>). Edit <code>/etc/bind/named.conf.local</code> and set the domain zones for the cluster (forward and reverse). We have two subnets for which reverse lookups must be specified: 192.168.20 and 192.168.21.
<source lang="text">
//
// Do any local configuration here
//
// Consider adding the 1918 zones here, if they are not used in your
// organization
//include "/etc/bind/zones.rfc1918";

include "/etc/bind/rndc.key";

controls {
    inet 127.0.0.1 allow { localhost; } keys { "rndc-key"; };
};

zone "heterogeneous.ucd.ie" {
    type master;
    file "db.heterogeneous.ucd.ie";
};

zone "21.168.192.in-addr.arpa" {
    type master;
    file "db.192.168.21";
};

zone "20.168.192.in-addr.arpa" {
    type master;
    file "db.192.168.20";
};
</source>
Also edit the options file, <code>/etc/bind/named.conf.options</code>. Note the subnet we define in the allow sections, 192.168.20/23: it permits access from 192.168.20.* and 192.168.21.* addresses.
<source lang="text">
options {
    directory "/var/cache/bind";

    // If there is a firewall between you and nameservers you want
    // to talk to, you may need to fix the firewall to allow multiple
    // ports to talk.  See http://www.kb.cert.org/vuls/id/800113

    // If your ISP provided one or more IP addresses for stable
    // nameservers, you probably want to use them as forwarders.
    // Uncomment the following block, and insert the addresses replacing
    // the all-0's placeholder.
    forwarders {
        137.43.116.19;
        137.43.116.17;
        137.43.105.22;
    };

    recursion yes;
    version "REFUSED";
    allow-recursion { 127.0.0.1; 192.168.20.0/23; };
    allow-query { 127.0.0.1; 192.168.20.0/23; };

    auth-nxdomain no;    # conform to RFC1035
    listen-on-v6 { any; };
};
</source>
Now work on the zone files specified, <code>db.heterogeneous.ucd.ie</code>, and the reverse maps, <code>db.192.168.21</code> and <code>db.192.168.20</code>. Populate them with all nodes of the cluster.

===IP Tables===
* Set up <code>iptables</code>. We want to implement NAT between the internal network (<code>eth0</code>) and the external one (<code>eth1</code>). Add a script named <code>00iptables</code> to the <code>/etc/network/if-up.d</code> directory. All scripts in this directory are executed after the network interfaces are brought up, so the rules will persist:
<source lang="bash">
#!/bin/sh
PATH=/usr/sbin:/sbin:/bin:/usr/bin
IF_INT=eth0
IF_EXT=eth1

#
# delete all existing rules.
#
iptables -F
iptables -t nat -F
iptables -t mangle -F
iptables -X

# Always accept loopback traffic
iptables -A INPUT -i lo -j ACCEPT

# Allow established connections, and those not coming from the outside
iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
iptables -A INPUT -m state --state NEW ! -i $IF_EXT -j ACCEPT
iptables -A FORWARD -i $IF_EXT -o $IF_INT -m state --state ESTABLISHED,RELATED -j ACCEPT

# Allow outgoing connections from the LAN side.
iptables -A FORWARD -i $IF_INT -o $IF_EXT -j ACCEPT

# Masquerade.
iptables -t nat -A POSTROUTING -o $IF_EXT -j MASQUERADE

# Don't forward from the outside to the inside.
iptables -A FORWARD -i $IF_EXT -o $IF_INT -j REJECT

# Enable routing.
echo 1 > /proc/sys/net/ipv4/ip_forward
</source>

==Clonezilla==
Firstly, Clonezilla is probably going to pollute a lot of your server configuration when it sets itself up. Be prepared to lose your iptables configuration, NFS (if any) and DHCP settings. Maybe more.
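Given that warning, it is worth snapshotting the affected files before running the Clonezilla/drbl setup, so they can be diffed or restored afterwards. A minimal sketch; the helper name <code>save_configs</code> and the exact file list are illustrative, not part of the original install:
<source lang="bash">
#!/bin/sh
# save_configs: copy the listed files (those that exist) into a backup
# directory, preserving timestamps. Adjust the file list to whatever
# drbl4imp actually touches on your system.
save_configs() {
    dest=$1; shift
    mkdir -p "$dest"
    for f in "$@"; do
        [ -f "$f" ] && cp -p "$f" "$dest/$(basename "$f")"
    done
    return 0
}
</source>
For example: <code>save_configs /root/pre-clonezilla /etc/network/interfaces /etc/dhcp3/dhcpd.conf /etc/exports</code>. The live iptables rules are not a file, so capture them separately with <code>iptables-save > /root/pre-clonezilla/iptables.rules</code>; they can later be reloaded with <code>iptables-restore</code>.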
* Follow the guide to installing Clonezilla [http://www.howtoforge.com/cloning-linux-systems-with-clonezilla-server-edition-clonezilla-se-p2 here]. Essentially:
** add the repository key: <source lang="text">wget -q http://drbl.sourceforge.net/GPG-KEY-DRBL -O- | apt-key add -</source>
** add the line to /etc/apt/sources.list: <source lang="text">deb http://drbl.sourceforge.net/drbl-core drbl stable</source>
** run: <source lang="text">apt-get update && apt-get install drbl && /opt/drbl/sbin/drbl4imp</source>
** accept the default options to drbl4imp.
* After Clonezilla has installed, edit <code>/etc/dhcp3/dhcpd.conf</code>, adding all entries for the test nodes <code>hcl07</code> and <code>hcl03</code>. Also ensure these nodes have been removed from the existing heterogeneous.ucd.ie server, so that they are only served by one machine.
<source lang="text">
default-lease-time 300;
max-lease-time 300;
option subnet-mask 255.255.255.0;
option domain-name-servers 137.43.116.19,137.43.116.17,137.43.105.22;
option domain-name "ucd.ie";

ddns-update-style none; # brett had ad-hoc ...?
server-name drbl;
filename = "pxelinux.0";

subnet 192.168.21.0 netmask 255.255.255.0 {
    option subnet-mask 255.255.255.0;
    option routers 192.168.21.1;
    next-server 192.168.21.254;
    pool {
        # allow members of "DRBL-Client";
        range 192.168.21.200 192.168.21.212;
    }
    host hcl03 {
        option host-name "hcl03.ucd.ie";
        hardware ethernet 00:14:22:0A:22:6C;
        fixed-address 192.168.21.5;
    }
    host hcl03_eth1 {
        option host-name "hcl03_eth1.ucd.ie";
        hardware ethernet 00:14:22:0A:22:6D;
        fixed-address 192.168.21.105;
    }
    host hcl07 {
        option host-name "hcl07.ucd.ie";
        hardware ethernet 00:14:22:0A:20:E2;
        fixed-address 192.168.21.9;
    }
    host hcl07_eth1 {
        option host-name "hcl07_eth1.ucd.ie";
        hardware ethernet 00:14:22:0A:20:E3;
        fixed-address 192.168.21.109;
    }
    default-lease-time 21600;
    max-lease-time 43200;
}
</source>

==NTP Daemon==
Nodes in the cluster should use the Network Time Protocol to set their clocks; heterogeneous.ucd.ie should provide them that service. Install ntpd with the following command:
 apt-get install ntp
Configure the daemon with the following line in <code>/etc/ntp.conf</code>:
 restrict 192.168.20.0 mask 255.255.254.0 nomodify notrap

==Install DHCP==
Install the DHCP server package with <code>apt-get install dhcp3-server</code>. When you install Clonezilla it will probably pollute your DHCP server setup, so make ... TODO

==Install NIS==
Copy users from <code>passwd</code>, <code>groups</code> and <code>shadow</code> from <code>/etc</code> on <code>hcl01</code>. Install nis.
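The "copy users" step above can be sketched as a filter that keeps only regular-account entries, consistent with the MINUID/MINGID of 500 configured for NIS later in this section. The helper name is hypothetical:
<source lang="bash">
#!/bin/sh
# select_users: print passwd/group-format lines whose numeric ID (third
# colon-separated field) is a regular account: >= 500 (matching MINUID/
# MINGID below) and under the 65534 "nobody" ID. Reads the files given
# as arguments, or stdin if none.
select_users() {
    awk -F: '$3 >= 500 && $3 < 65534' "$@"
}
</source>
For example, against a copy of hcl01's file: <code>select_users hcl01-passwd >> /etc/passwd</code>, and similarly for <code>group</code>. Note that <code>shadow</code> entries carry no UID field, so filter those by the usernames selected from <code>passwd</code>.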
Set the domain to heterogeneous.ucd.ie. Edit <code>/etc/defaultdomain</code> so that it contains:
 heterogeneous.ucd.ie
Edit <code>/etc/default/nis</code> so that it contains:
 # Are we a NIS server and if so what kind (values: false, slave, master)
 NISSERVER=master
Edit <code>/etc/ypserv.securenets</code> so that it contains:
 # allow connects from local
 255.0.0.0 127.0.0.0
 # allow connections from heterogeneous subnets .20 and .21
 255.255.254.0 192.168.20.0
The NIS host is also a client of itself, so do the client set up as follows. Edit <code>/etc/hosts</code> and ensure the NIS master is listed:
 192.168.21.254 heterogeneous.ucd.ie heterogeneous
Edit <code>/etc/yp.conf</code> and ensure that it contains:
 domain heterogeneous.ucd.ie server localhost
Edit <code>/etc/passwd</code>, adding a line to the end that reads <code>+::::::</code>. Edit <code>/etc/group</code>, adding a line <code>+:::</code> at the end. The NIS Makefile will not pull user IDs and group IDs that are lower than a certain value; we must set this to 500 in <code>/var/yp/Makefile</code>:
 MINUID=500
 MINGID=500
Start the <code>ypbind</code> and <code>yppasswd</code> services. Then initialise the NIS database:
 /usr/lib/yp/ypinit -m
Accept the defaults at the prompts. Now start the other NIS services:
 service nis start

==Installing Ganglia Frontend==
Install the packages gmetad and ganglia-webfrontend. Configure the front end by appending the following to <code>/etc/apache2/apache2.conf</code>:
 Include /etc/ganglia-webfrontend/apache.conf
Configure gmetad by adding the following line to <code>/etc/ganglia/gmetad.conf</code>:
 data_source "HCL Cluster" 192.168.20.1 192.168.20.16
This means that the gmetad collector connects to hcl01 and hcl16 on the .20 subnet to gather data for the frontend to use.
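To confirm that the data_source hosts are actually answering gmetad's polls, one can fetch the XML that gmond serves and pull out the host names it reports. A small sketch; it assumes gmond is listening on its default TCP port 8649 and emitting its stock <code>&lt;HOST NAME="..."&gt;</code> elements:
<source lang="bash">
#!/bin/sh
# gmond_hosts: print the HOST names found in a gmond XML dump read on stdin.
gmond_hosts() {
    sed -n 's/.*<HOST NAME="\([^"]*\)".*/\1/p'
}

# Example (8649 is gmond's default xml port; check gmond.conf):
#   nc 192.168.20.1 8649 | gmond_hosts
</source>
An empty result from either of the two data_source addresses means the frontend will show gaps, and gmond on that node needs attention.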
After all packages are configured, execute:
<source lang="text">
service apache2 restart
service gmetad restart
</source>
Pointing your browser [http://heterogeneous.ucd.ie/ganglia/index.php here] should display the monitoring page for the HCL Cluster. <code>gmond</code> must also be installed and configured on the cluster nodes.

=Hardware Monitoring & Backup=
==Disk Monitoring==
Install smartmontools as per [http://www.howtoforge.com/checking-hard-disk-sanity-with-smartmontools-debian-ubuntu here]. Briefly:
 apt-get install smartmontools
Edit <code>/etc/default/smartmontools</code> so that it contains:
<source lang="text">
# List of devices you want to explicitly enable S.M.A.R.T. for
# Not needed (and not recommended) if the device is monitored by smartd
enable_smart="/dev/sda"
# uncomment to start smartd on system startup
start_smartd=yes
# uncomment to pass additional options to smartd on startup
smartd_opts="--interval=1800"
</source>
Open <code>/etc/smartd.conf</code> and edit the first line that begins with DEVICESCAN (all lines after the first instance of DEVICESCAN are ignored). Have it read something like:
 DEVICESCAN -d removable -n standby -m root -m robert_higgins@iol.ie -M exec /usr/share/smartmontools/smartd-runner
Then start the service: <code>/etc/init.d/smartmontools start</code>

Note: consider installing this on all nodes, as it would be useful to have prior notice of any failing disks.

= Torque - PBS =
Download Torque from [http://www.clusterresources.com/pages/products/torque-resource-manager.php here]. Extract the archive and configure:
 ./configure --prefix=/usr/local
Run <code>make && make install</code>. Torque will have installed files under <code>/usr/local</code> and <code>/var/spool/torque</code>.
Edit the Torque config file <code>/var/spool/torque/torque.cfg</code>, adding the line:
 SERVERHOST=192.168.20.254
Add the compute nodes of the cluster to the file <code>/var/spool/torque/server_priv/nodes</code>:
<source lang="text">
hcl01
hcl02
hcl03
hcl04
hcl05
hcl06
hcl07
hcl08
hcl09
hcl10
hcl11
hcl12
hcl13
hcl14
hcl15
hcl16
</source>
Now return to the extracted source distribution, find the <code>contrib/init.d</code> folder and copy <code>debian.pbs_server</code> to <code>/etc/init.d/pbs_server</code>. Edit the copied file so that the header reads:
<source lang="bash">
#!/bin/sh
### BEGIN INIT INFO
# Provides:       pbs_server
# Required-Start: $local_fs $named
# Should-Start:
# Required-Stop:
</source>
This will ensure that important services are started before the PBS/Torque daemon.

===Queues===
We will configure four queues, with varying priority and preemption policies.
*The first is the <b>normal</b> queue. This is where most jobs, which should not be interrupted, will be placed. It has the highest priority.
*The second queue is <b>lowpri</b>; it is for jobs which may run for extended periods but are interruptible. Should a job <i>A</i> be placed on the normal queue while a lowpri job <i>B</i> is running, <i>B</i> will be sent a kill signal, so it may shut down cleanly, and then be requeued on the system, so it can resume running after <i>A</i> has finished.
*The third queue is for running service jobs like the homedir backup. It is named <b>service</b>. It has lower priority than the above queues, and jobs running on it will be preemptable.
*The final queue is the <b>volunteer</b> queue. This has the lowest priority and is for executing volunteer computing jobs during otherwise idle periods. These jobs will also be preemptable.

====Queue Setup====
Enter the queue manager program <code>qmgr</code> as the root user on heterogeneous.
Send it the following commands.

Allow all users to see all queued jobs:
 set server query_other_jobs=TRUE
Create the default <b>normal</b> queue:
 create queue normal
 set queue normal queue_type = Execution
 set queue normal Priority = 10000
 set queue normal enabled = True
 set queue normal started = True
Create the <b>lowpri</b> queue:
 create queue lowpri
 set queue lowpri queue_type = Execution
 set queue lowpri Priority = 1000
 set queue lowpri enabled = True
 set queue lowpri started = True
Create the <b>service</b> queue:
 create queue service
 set queue service queue_type = Execution
 set queue service Priority = 100
 set queue service enabled = True
 set queue service started = True
Create the <b>volunteer</b> queue:
 create queue volunteer
 set queue volunteer queue_type = Execution
 set queue volunteer Priority = 0
 set queue volunteer enabled = True
 set queue volunteer started = True
Set some server settings:
 set server default_queue=normal
 set server resources_default.nodes = 1
 set server resources_default.walltime = 12:00:00
Now edit the maui configuration for these queues:
<source lang="text">
SERVERHOST            heterogeneous
ADMIN1                root
RMPOLLINTERVAL        00:00:01
DEFERTIME             0
DEFERCOUNT            86400
PREEMPTIONPOLICY      REQUEUE
QUEUETIMEWEIGHT       1
QOSWEIGHT             1
SYSCFG                QLIST=normal,lowpri,service,volunteer
QOSCFG[normal]        QFLAGS=PREEMPTOR
QOSCFG[lowpri]        QFLAGS=PREEMPTEE,PREEMPTOR
QOSCFG[service]       QFLAGS=PREEMPTEE,PREEMPTOR
QOSCFG[volunteer]     QFLAGS=PREEMPTEE
CREDWEIGHT            1
CLASSWEIGHT           1
CLASSCFG[normal]      QDEF=normal PRIORITY=10000
CLASSCFG[lowpri]      QDEF=lowpri PRIORITY=1000
CLASSCFG[service]     QDEF=service PRIORITY=100
CLASSCFG[volunteer]   QDEF=volunteer PRIORITY=0
</source>
c6bcf9112e21955e91b9197a1a44d9a01f6b25bb 493 492 2010-10-07T17:44:02Z Davepc 2 /* Install NIS */ wikitext text/x-wiki
* Basic installation of Debian Squeeze

==Networking==
===Interfaces===
* edit <code>/etc/network/interfaces</code>

Note that at some point eth1 should be configured by DHCP; it is on the UCD LAN and must be registered correctly (update MAC address with services).
<code>eth0</code> is the internal network. <source lang="text"> # The loopback network interface auto lo eth0 eth1 iface lo inet loopback # The primary network interface allow-hotplug eth0 iface eth0 inet static address 192.168.21.254 netmask 255.255.255.0 gateway 192.168.21.1 iface eth1 inet static address 193.1.132.124 netmask 255.255.252.0 gateway 193.1.132.1 </source> * Install non-free linux firmware for network interface (eth0). This will allow Gigabit operation on eth0 with the tg3 hardware (I think). Edit <code>/etc/apt/sources.list</code> including the lines: <source lang="text">deb http://ftp.ie.debian.org/debian/ squeeze main contrib non-free deb-src http://ftp.ie.debian.org/debian/ squeeze main contrib non-free</source> * Install firmware-linux: <source lang="text">apt-get update && apt-get install firmware-linux</source> You probably need to reboot now. ===DNS / BIND=== We will run our own DNS server for the cluster. First set resolv.conf: <source lang="text"> nameserver 127.0.0.1 nameserver 137.43.116.19 nameserver 137.43.116.17 nameserver 137.43.105.22 domain ucd.ie search ucd.ie </source> Now install bind9 (<code>apt-get install bind9</code>). Edit <code>/etc/bind/named.conf.local</code> and set the domain zones for the cluster (forwards and reverse). 
We have two subdomains where reverse lookups will have to be specified 192.168.20 and 192.168.21 <source lang="text"> // // Do any local configuration here // // Consider adding the 1918 zones here, if they are not used in your // organization //include "/etc/bind/zones.rfc1918"; include "/etc/bind/rndc.key"; controls { inet 127.0.0.1 allow { localhost; } keys { "rndc-key"; }; }; zone "heterogeneous.ucd.ie" { type master; file "db.heterogeneous.ucd.ie"; }; zone "21.168.192.in-addr.arpa" { type master; file "db.192.168.21"; }; zone "20.168.192.in-addr.arpa" { type master; file "db.192.168.20"; }; </source> Also edit the options file: <code>/etc/bind/named.conf.options</code>, note the subnet we define in the allow sections, 192.168.20/23, it will permit access from 192.168.20.* and 192.168.21.* addresses. <source lang="text"> options { directory "/var/cache/bind"; // If there is a firewall between you and nameservers you want // to talk to, you may need to fix the firewall to allow multiple // ports to talk. See http://www.kb.cert.org/vuls/id/800113 // If your ISP provided one or more IP addresses for stable // nameservers, you probably want to use them as forwarders. // Uncomment the following block, and insert the addresses replacing // the all-0's placeholder. forwarders { 137.43.116.19; 137.43.116.17; 137.43.105.22; }; recursion yes; version "REFUSED"; allow-recursion { 127.0.0.1; 192.168.20.0/23; }; allow-query { 127.0.0.1; 192.168.20.0/23; }; auth-nxdomain no; # conform to RFC1035 listen-on-v6 { any; }; }; </source> Now work on the zone files specified <code>db.heterogneneous.ucd.ie</code> and the reverse maps <code>db.192.168.21</code> & <code>db.192.168.21</code> Populate them with all nodes of the cluster. ===IP Tables=== * Set up <code>iptables</code>. We want to implement NAT between the internal network (<code>eth0</code>) and external one (<code>eth1</code>). Add a script to <code>/etc/network/if-up.d</code> directory, named <code>00iptables</code>. 
All scripts in this directory will be executed after network interfaces are brought up, so this will persist: <source lang="bash"> #!/bin/sh PATH=/usr/sbin:/sbin:/bin:/usr/bin IF_INT=eth0 IF_EXT=eth1 # # delete all existing rules. # iptables -F iptables -t nat -F iptables -t mangle -F iptables -X # Always accept loopback traffic iptables -A INPUT -i lo -j ACCEPT # Allow established connections, and those not coming from the outside iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT iptables -A INPUT -m state --state NEW ! -i $IF_EXT -j ACCEPT iptables -A FORWARD -i $IF_EXT -o $IF_INT -m state --state ESTABLISHED,RELATED -j ACCEPT # Allow outgoing connections from the LAN side. iptables -A FORWARD -i $IF_INT -o $IF_EXT -j ACCEPT # Masquerade. iptables -t nat -A POSTROUTING -o $IF_EXT -j MASQUERADE # Don't forward from the outside to the inside. iptables -A FORWARD -i $IF_EXT -o $IF_EXT -j REJECT # Enable routing. echo 1 > /proc/sys/net/ipv4/ip_forward </source> ==Clonezilla== Firstly, Clonezilla is probably going to pollute a lot of your server configuration when it sets itself up. Be prepared to loose your IPtables configuration, NFS (if any) and DHCP settings. Maybe more. * follow the guide to installing Clonezilla [http://www.howtoforge.com/cloning-linux-systems-with-clonezilla-server-edition-clonezilla-se-p2 here]. Essentially: ** add repository key <source lang="text">wget -q http://drbl.sourceforge.net/GPG-KEY-DRBL -O- | apt-key add -</source> ** the line add to /etc/apt/sources.list: <source lang="text">deb http://drbl.sourceforge.net/drbl-core drbl stable</source> ** run: <source lang="text">apt-get update && apt-get install drbl && /opt/drbl/sbin/drbl4imp</source> ** accept default options to drbl4imp. * After Clonezilla has installed edit <code>/etc/dhcpd3/dhcpd.conf</code>, adding all entries for test nodes <code>hcl07</code> and <code>hcl03</code>. 
Also ensure these nodes have been removed from the inplace heterogeneous.ucd.ie server so that they are only served by one machine. <source lang="text"> default-lease-time 300; max-lease-time 300; option subnet-mask 255.255.255.0; option domain-name-servers 137.43.116.19,137.43.116.17,137.43.105.22; option domain-name "ucd.ie"; ddns-update-style none; # brett had ad-hoc ...? server-name drbl; filename = "pxelinux.0"; subnet 192.168.21.0 netmask 255.255.255.0 { option subnet-mask 255.255.255.0; option routers 192.168.21.1; next-server 192.168.21.254; pool { # allow members of "DRBL-Client"; range 192.168.21.200 192.168.21.212; } host hcl03 { option host-name "hcl03.ucd.ie"; hardware ethernet 00:14:22:0A:22:6C; fixed-address 192.168.21.5; } host hcl03_eth1 { option host-name "hcl03_eth1.ucd.ie"; hardware ethernet 00:14:22:0A:22:6D; fixed-address 192.168.21.105; } host hcl07 { option host-name "hcl07.ucd.ie"; hardware ethernet 00:14:22:0A:20:E2; fixed-address 192.168.21.9; } host hcl07_eth1 { option host-name "hcl07_eth1.ucd.ie"; hardware ethernet 00:14:22:0A:20:E3; fixed-address 192.168.21.109; } default-lease-time 21600; max-lease-time 43200; } </source> ==NTP Daemon== Nodes in the cluster should use the Network Time Protocol to set their clocks, heterogeneous.ucd.ie should provide them that service. Install ntpd with the following command: apt-get install ntp Configure the daemon with the following line in <code>/etc/ntp.conf</code>: restrict 192.168.20.0 mask 255.255.254.0 nomodify notrap ==Install DHCP== Install the DHCP server package with <code>apt-get install dhcp3-server</code>. When you install Clonezilla it will probably pollute your DHCP server setup, so make ... TODO ==Install NIS== Copy users from <code>passwd</code>, <code>groups</code> and <code>shadow</code> from <code>/etc</code> on <code>hcl01</code>. Install nis. 
Set domain to heterogeneous.ucd.ie Edit <code>/etc/defaultdomain</code> so that it contains: heterogeneous.ucd.ie Edit <code>/etc/default/nis</code> so that it contains: # Are we a NIS server and if so what kind (values: false, slave, master) NISSERVER=master Edit <code>/etc/ypserv.securenets</code> so that is contains: # allow connects from local 255.0.0.0 127.0.0.0 # allow connections from heterogeneous subnets .20 and .21 255.255.254.0 192.168.20.0 The NIS host is also a client of itself, so do the client set up as follows: Edit <code>/etc/hosts</code> end ensure the NIS Master is listed 192.168.21.254 heterogeneous.ucd.ie heterogeneous Edit <code>/etc/yp.conf</code> and ensure that it contains: domain heterogeneous.ucd.ie server localhost Edit <code>/etc/passwd</code> adding a line to the end that reads: <code>+::::::</code>. Edit <code>/etc/group</code> with a line <code>+:::</code> at the line. The NIS Makefile will not pull userid and groupids that are lower than a certain value, we must set this to 500 in <code>/var/yp/Makefile</code> MINUID=500 MINGID=500 Start the <code>ypbind</code> and <code>yppasswd</code> services. Then initialise the NIS database: /usr/lib/yp/ypinit -m Accept defaults at prompts. Now start other NIS services service nis start ==Installing Ganglia Frontend== Install the packages gmetad and ganglia-webfrontend. Configure the front end by appending to <code>/etc/apache2/apache2.conf</code>, the following: Include /etc/ganglia-webfrontend/apache.conf Configure gmetad by adding to the <code>/etc/ganglia/gmetad.conf</code>, the following line: data_source "HCL Cluster" 192.168.20.1 192.168.20.16 This means that the gmetad collector connect to hcl01 and hcl16 on the .20 subnet to gather data for the frontend to use. 
After all packages are configured execute: <source lang="text"> service apache2 restart service gmetad restart </source> Pointing your browser to [http://heterogeneous.ucd.ie/ganglia/index.php here] should display the monitoring page for HCL Cluster. <code>gmond</code> must also be installed and configured on the cluster nodes. =Hardware Monitoring & Backup= ==Disk Monitoring== Install smartmontools as per [http://www.howtoforge.com/checking-hard-disk-sanity-with-smartmontools-debian-ubuntu here]. Briefly: apt-get install smartmontools Edit <code>/etc/defaults/smartmontools</code> so that it contains: <source lang="text"> # List of devices you want to explicitly enable S.M.A.R.T. for # Not needed (and not recommended) if the device is monitored by smartd enable_smart="/dev/sda" # uncomment to start smartd on system startup start_smartd=yes # uncomment to pass additional options to smartd on startup smartd_opts="--interval=1800" </source> Open <code>/etc/smartd.conf</code> and edit the first line that begins with DEVICESCAN (all lines after the first instance of DEVICESCAN are ignored). Have it read something like: DEVICESCAN -d removable -n standby -m root -m robert_higgins@iol.ie -M exec /usr/share/smartmontools/smartd-runner Then start the service <code>/etc/init.d/smartmontools start</code> Note, consider installing this on all nodes, as it would be interesting to have prior notice of any failing disks. = Torque - PBS = Download Torque from [http://www.clusterresources.com/pages/products/torque-resource-manager.php here]. Extract archive and configure: ./configure --prefix=/usr/local Run <code>make && make install</code>. Torque will have installed files under <code>/usr/local</code> and <code>/var/spool/torque</code>. 
Edit the Torque config file <code>/var/spool/torque/torque.cfg</code> adding the line: SERVERHOST=192.168.20.254 Add the compute nodes of the cluster to the file <code>/var/spool/torque/server_priv/nodes</code> <source lang="text"> hcl01 hcl02 hcl03 hcl04 hcl05 hcl06 hcl07 hcl08 hcl09 hcl10 hcl11 hcl12 hcl13 hcl14 hcl15 hcl16 </source> Now return to the extracted source distribution, find the <code>contrib/init.d</code> folder and copy the <code>debian.pbs_server</code> to <code>/etc/init.d/pbs_server</code>. Edit the inplace file so that the header reads: <source lang="bash"> #!/bin/sh ### BEGIN INIT INFO # Provides: pbs_server # Required-Start: $local_fs $named # Should-Start: # Required-Stop: </source> This will ensure that important services are started before the PBS/Torque daemon. ===Queues=== We will configure 4 queues, with varying priority and preemption policies. *The first is the <b>normal</b> queue. This will be where most jobs, which should not be interrupted, will be placed. It has the highest priority. *The second queue is <b>lowpri</b>, it is for running jobs which may run for extended periods but are interruptible. Should a job <i>A</i> be placed on the normal queue while a lowpri job <i>B</i> is running, <i>B</i> will be sent a kill signal, so it may shut down cleanly, and then it will be requeued on the system, so it can resume running after <i>A</i>b has finished. *The third queue is for running of service jobs like the homedir backup. It is named <b>service</b>. This will have lower priority than the above queues and jobs running on this queue will be preemptable. *The final queue is the <b>volunteer</b> queue. This has the lowest priority and is for executing volunteer computing jobs during otherwise idle periods. These jobs will also be preemptable. ====Queue Setup==== Enter the queue manager program <code>qmgr</code> as the root user on heterogeneous. 
Send it the following commands: Allow all users to see all queued jobs: set server query_other_jobs=TRUE' Create the default <b>normal</b> queue create queue normal set queue normal queue_type = Execution set queue normal Priority = 10000 set queue normal enabled = True set queue normal started = True Create <b>lowpri</b> queue create queue lowpri set queue lowpri queue_type = Execution set queue lowpri Priority = 1000 set queue lowpri enabled = True set queue lowpri started = True Create <b>service</b> queue set queue service queue_type = Execution set queue service Priority = 100 set queue service enabled = True set queue service started = True Create <b>volunteer</b> queue set queue volunteer queue_type = Execution set queue volunteer Priority = 0 set queue volunteer enabled = True set queue volunteer started = True Set some server settings set server default_queue=normal set server resources_default.nodes = 1 set server resources_default.walltime = 12:00:00 Now edit the maui configuration for these queues: <source lang="text"> SERVERHOST heterogeneous ADMIN1 root RMPOLLINTERVAL 00:00:01 DEFERTIME 0 DEFERCOUNT 86400 PREEMPTIONPOLICY REQUEUE QUEUETIMEWEIGHT 1 QOSWEIGHT 1 SYSCFG QLIST=normal,lowpri,service,volunteer QOSCFG[normal] QFLAGS=PREEMPTOR QOSCFG[lowpri] QFLAGS=PREEMPTEE,PREEMPTOR QOSCFG[service] QFLAGS=PREEMPTEE,PREEMPTOR QOSCFG[volunteer] QFLAGS=PREEMPTEE CREDWEIGHT 1 CLASSWEIGHT 1 CLASSCFG[normal] QDEF=normal PRIORITY=10000 CLASSCFG[lowpri] QDEF=lowpri PRIORITY=1000 CLASSCFG[service] QDEF=service PRIORITY=100 CLASSCFG[volunteer] QDEF=volunteer PRIORITY=0 </source> 60f22c2f2d72eea4618a00f44feb2152e0d7d1ab 494 493 2010-10-07T17:47:37Z Davepc 2 /* Install NIS */ wikitext text/x-wiki * Basic installation of Debian Squeeze ==Networking== ===Interfaces=== * edit <code>/etc/networks/interfaces</code> Note that at some point eth1 should be configured by DHCP, it is on the UCD LAN and must be registered correctly (update MAC address with services). 
<code>eth0</code> is the internal network. <source lang="text"> # The loopback network interface auto lo eth0 eth1 iface lo inet loopback # The primary network interface allow-hotplug eth0 iface eth0 inet static address 192.168.21.254 netmask 255.255.255.0 gateway 192.168.21.1 iface eth1 inet static address 193.1.132.124 netmask 255.255.252.0 gateway 193.1.132.1 </source> * Install non-free linux firmware for network interface (eth0). This will allow Gigabit operation on eth0 with the tg3 hardware (I think). Edit <code>/etc/apt/sources.list</code> including the lines: <source lang="text">deb http://ftp.ie.debian.org/debian/ squeeze main contrib non-free deb-src http://ftp.ie.debian.org/debian/ squeeze main contrib non-free</source> * Install firmware-linux: <source lang="text">apt-get update && apt-get install firmware-linux</source> You probably need to reboot now. ===DNS / BIND=== We will run our own DNS server for the cluster. First set resolv.conf: <source lang="text"> nameserver 127.0.0.1 nameserver 137.43.116.19 nameserver 137.43.116.17 nameserver 137.43.105.22 domain ucd.ie search ucd.ie </source> Now install bind9 (<code>apt-get install bind9</code>). Edit <code>/etc/bind/named.conf.local</code> and set the domain zones for the cluster (forwards and reverse). 
We have two subnets for which reverse lookups must be specified: 192.168.20 and 192.168.21. <source lang="text"> // // Do any local configuration here // // Consider adding the 1918 zones here, if they are not used in your // organization //include "/etc/bind/zones.rfc1918"; include "/etc/bind/rndc.key"; controls { inet 127.0.0.1 allow { localhost; } keys { "rndc-key"; }; }; zone "heterogeneous.ucd.ie" { type master; file "db.heterogeneous.ucd.ie"; }; zone "21.168.192.in-addr.arpa" { type master; file "db.192.168.21"; }; zone "20.168.192.in-addr.arpa" { type master; file "db.192.168.20"; }; </source> Also edit the options file <code>/etc/bind/named.conf.options</code>; note the subnet defined in the allow sections, 192.168.20.0/23, which permits access from 192.168.20.* and 192.168.21.* addresses. <source lang="text"> options { directory "/var/cache/bind"; // If there is a firewall between you and nameservers you want // to talk to, you may need to fix the firewall to allow multiple // ports to talk. See http://www.kb.cert.org/vuls/id/800113 // If your ISP provided one or more IP addresses for stable // nameservers, you probably want to use them as forwarders. // Uncomment the following block, and insert the addresses replacing // the all-0's placeholder. forwarders { 137.43.116.19; 137.43.116.17; 137.43.105.22; }; recursion yes; version "REFUSED"; allow-recursion { 127.0.0.1; 192.168.20.0/23; }; allow-query { 127.0.0.1; 192.168.20.0/23; }; auth-nxdomain no; # conform to RFC1035 listen-on-v6 { any; }; }; </source> Now work on the zone files specified, <code>/var/cache/bind/db.heterogeneous.ucd.ie</code>, and the reverse maps <code>/var/cache/bind/db.192.168.21</code> & <code>/var/cache/bind/db.192.168.20</code>. Populate them with all nodes of the cluster. ===IP Tables=== * Set up <code>iptables</code>. We want to implement NAT between the internal network (<code>eth0</code>) and the external one (<code>eth1</code>). Add a script named <code>00iptables</code> to the <code>/etc/network/if-up.d</code> directory.
All scripts in this directory are executed after network interfaces are brought up, so the rules will persist: <source lang="bash"> #!/bin/sh PATH=/usr/sbin:/sbin:/bin:/usr/bin IF_INT=eth0 IF_EXT=eth1 # # delete all existing rules. # iptables -F iptables -t nat -F iptables -t mangle -F iptables -X # Always accept loopback traffic iptables -A INPUT -i lo -j ACCEPT # Allow established connections, and those not coming from the outside iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT iptables -A INPUT -m state --state NEW ! -i $IF_EXT -j ACCEPT iptables -A FORWARD -i $IF_EXT -o $IF_INT -m state --state ESTABLISHED,RELATED -j ACCEPT # Allow outgoing connections from the LAN side. iptables -A FORWARD -i $IF_INT -o $IF_EXT -j ACCEPT # Masquerade. iptables -t nat -A POSTROUTING -o $IF_EXT -j MASQUERADE # Don't forward from the outside to the inside. iptables -A FORWARD -i $IF_EXT -o $IF_EXT -j REJECT # Enable routing. echo 1 > /proc/sys/net/ipv4/ip_forward </source> ==Clonezilla== Firstly, Clonezilla is probably going to pollute a lot of your server configuration when it sets itself up. Be prepared to lose your iptables configuration, NFS (if any) and DHCP settings. Maybe more. * Follow the guide to installing Clonezilla [http://www.howtoforge.com/cloning-linux-systems-with-clonezilla-server-edition-clonezilla-se-p2 here]. Essentially: ** add the repository key <source lang="text">wget -q http://drbl.sourceforge.net/GPG-KEY-DRBL -O- | apt-key add -</source> ** add the line to /etc/apt/sources.list: <source lang="text">deb http://drbl.sourceforge.net/drbl-core drbl stable</source> ** run: <source lang="text">apt-get update && apt-get install drbl && /opt/drbl/sbin/drbl4imp</source> ** accept the default options for drbl4imp. * After Clonezilla has installed, edit <code>/etc/dhcp3/dhcpd.conf</code>, adding entries for the test nodes <code>hcl07</code> and <code>hcl03</code>.
Also ensure these nodes have been removed from the existing DHCP configuration on heterogeneous.ucd.ie so that they are served by only one machine. <source lang="text"> default-lease-time 300; max-lease-time 300; option subnet-mask 255.255.255.0; option domain-name-servers 137.43.116.19,137.43.116.17,137.43.105.22; option domain-name "ucd.ie"; ddns-update-style none; # brett had ad-hoc ...? server-name drbl; filename = "pxelinux.0"; subnet 192.168.21.0 netmask 255.255.255.0 { option subnet-mask 255.255.255.0; option routers 192.168.21.1; next-server 192.168.21.254; pool { # allow members of "DRBL-Client"; range 192.168.21.200 192.168.21.212; } host hcl03 { option host-name "hcl03.ucd.ie"; hardware ethernet 00:14:22:0A:22:6C; fixed-address 192.168.21.5; } host hcl03_eth1 { option host-name "hcl03_eth1.ucd.ie"; hardware ethernet 00:14:22:0A:22:6D; fixed-address 192.168.21.105; } host hcl07 { option host-name "hcl07.ucd.ie"; hardware ethernet 00:14:22:0A:20:E2; fixed-address 192.168.21.9; } host hcl07_eth1 { option host-name "hcl07_eth1.ucd.ie"; hardware ethernet 00:14:22:0A:20:E3; fixed-address 192.168.21.109; } default-lease-time 21600; max-lease-time 43200; } </source> ==NTP Daemon== Nodes in the cluster should use the Network Time Protocol to set their clocks; heterogeneous.ucd.ie should provide that service. Install ntpd with the following command: apt-get install ntp Configure the daemon with the following line in <code>/etc/ntp.conf</code>: restrict 192.168.20.0 mask 255.255.254.0 nomodify notrap ==Install DHCP== Install the DHCP server package with <code>apt-get install dhcp3-server</code>. When you install Clonezilla it will probably pollute your DHCP server setup, so make ... TODO ==Install NIS== Copy the users from the <code>passwd</code>, <code>group</code> and <code>shadow</code> files in <code>/etc</code> on <code>hcl01</code>. Install the <code>nis</code> package.
Set the domain to heterogeneous.ucd.ie. Edit <code>/etc/defaultdomain</code> so that it contains: heterogeneous.ucd.ie Edit <code>/etc/default/nis</code> so that it contains: # Are we a NIS server and if so what kind (values: false, slave, master) NISSERVER=master Edit <code>/etc/ypserv.securenets</code> so that it contains: # allow connects from local 255.0.0.0 127.0.0.0 # allow connections from heterogeneous subnets .20 and .21 255.255.254.0 192.168.20.0 The NIS host is also a client of itself, so do the client set-up as follows: Edit <code>/etc/hosts</code> and ensure the NIS master is listed: 192.168.20.254 heterogeneous.ucd.ie heterogeneous Edit <code>/etc/yp.conf</code> and ensure that it contains: domain heterogeneous.ucd.ie server localhost Edit <code>/etc/passwd</code>, adding a line at the end that reads <code>+::::::</code>. Edit <code>/etc/group</code>, adding a line <code>+:::</code> at the end. The NIS Makefile will not pull user IDs and group IDs lower than a certain value; set this to 500 in <code>/var/yp/Makefile</code>: MINUID=500 MINGID=500 Start the <code>ypbind</code> and <code>yppasswd</code> services. Then initialise the NIS database: /usr/lib/yp/ypinit -m Accept the defaults at the prompts. Now start the other NIS services: service nis start ==Installing Ganglia Frontend== Install the packages gmetad and ganglia-webfrontend. Configure the front end by appending the following to <code>/etc/apache2/apache2.conf</code>: Include /etc/ganglia-webfrontend/apache.conf Configure gmetad by adding the following lines to <code>/etc/ganglia/gmetad.conf</code>: data_source "HCL Cluster" 192.168.20.1 192.168.20.16 data_source "HCL Service" localhost:8648 This means that the gmetad collector connects to hcl01 and hcl16 on the .20 subnet to gather data for the frontend to use.
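Several of the access-control settings in this guide (the BIND allow lists, the NTP restrict line, and the NIS securenets entry) rely on 192.168.20.0/23 (netmask 255.255.254.0) covering both the .20 and .21 internal subnets. A quick stand-alone check of that arithmetic, using Python's stdlib <code>ipaddress</code> module:

```python
import ipaddress

# 192.168.20.0/23 is the same as network 192.168.20.0 with mask 255.255.254.0
net = ipaddress.ip_network("192.168.20.0/23")
assert str(net.netmask) == "255.255.254.0"

# Both internal subnets fall inside the /23 ...
assert ipaddress.ip_address("192.168.20.1") in net     # hcl01
assert ipaddress.ip_address("192.168.21.254") in net   # heterogeneous, eth0

# ... but the next subnet up does not.
assert ipaddress.ip_address("192.168.22.1") not in net

print("ok")
```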
After all packages are configured execute: <source lang="text"> service apache2 restart service gmetad restart </source> Pointing your browser to [http://heterogeneous.ucd.ie/ganglia/index.php here] should display the monitoring page for the HCL Cluster. <code>gmond</code> must also be installed and configured on the cluster nodes. =Hardware Monitoring & Backup= ==Disk Monitoring== Install smartmontools as per [http://www.howtoforge.com/checking-hard-disk-sanity-with-smartmontools-debian-ubuntu here]. Briefly: apt-get install smartmontools Edit <code>/etc/default/smartmontools</code> so that it contains: <source lang="text"> # List of devices you want to explicitly enable S.M.A.R.T. for # Not needed (and not recommended) if the device is monitored by smartd enable_smart="/dev/sda" # uncomment to start smartd on system startup start_smartd=yes # uncomment to pass additional options to smartd on startup smartd_opts="--interval=1800" </source> Open <code>/etc/smartd.conf</code> and edit the first line that begins with DEVICESCAN (all lines after the first instance of DEVICESCAN are ignored). Have it read something like: DEVICESCAN -d removable -n standby -m root -m robert_higgins@iol.ie -M exec /usr/share/smartmontools/smartd-runner Then start the service: <code>/etc/init.d/smartmontools start</code> Note: consider installing this on all nodes, as it would be useful to have prior notice of any failing disks. = Torque - PBS = Download Torque from [http://www.clusterresources.com/pages/products/torque-resource-manager.php here]. Extract the archive and configure: ./configure --prefix=/usr/local Run <code>make && make install</code>. Torque will have installed files under <code>/usr/local</code> and <code>/var/spool/torque</code>.
Edit the Torque config file <code>/var/spool/torque/torque.cfg</code>, adding the line: SERVERHOST=192.168.20.254 Add the compute nodes of the cluster to the file <code>/var/spool/torque/server_priv/nodes</code>: <source lang="text"> hcl01 hcl02 hcl03 hcl04 hcl05 hcl06 hcl07 hcl08 hcl09 hcl10 hcl11 hcl12 hcl13 hcl14 hcl15 hcl16 </source> Now return to the extracted source distribution, find the <code>contrib/init.d</code> folder and copy <code>debian.pbs_server</code> to <code>/etc/init.d/pbs_server</code>. Edit the copied file so that the header reads: <source lang="bash"> #!/bin/sh ### BEGIN INIT INFO # Provides: pbs_server # Required-Start: $local_fs $named # Should-Start: # Required-Stop: </source> This will ensure that important services are started before the PBS/Torque daemon. ===Queues=== We will configure four queues with varying priority and preemption policies. *The first is the <b>normal</b> queue. This is where most jobs, which should not be interrupted, will be placed. It has the highest priority. *The second queue is <b>lowpri</b>; it is for jobs which may run for extended periods but are interruptible. Should a job <i>A</i> be placed on the normal queue while a lowpri job <i>B</i> is running, <i>B</i> will be sent a kill signal, so it may shut down cleanly, and then be requeued on the system, so it can resume running after <i>A</i> has finished. *The third queue, named <b>service</b>, is for running service jobs like the homedir backup. It has lower priority than the above queues, and jobs running on it will be preemptable. *The final queue is the <b>volunteer</b> queue. This has the lowest priority and is for executing volunteer computing jobs during otherwise idle periods. These jobs will also be preemptable. ====Queue Setup==== Enter the queue manager program <code>qmgr</code> as the root user on heterogeneous.
Send it the following commands. Allow all users to see all queued jobs: set server query_other_jobs=TRUE Create the default <b>normal</b> queue: create queue normal set queue normal queue_type = Execution set queue normal Priority = 10000 set queue normal enabled = True set queue normal started = True Create the <b>lowpri</b> queue: create queue lowpri set queue lowpri queue_type = Execution set queue lowpri Priority = 1000 set queue lowpri enabled = True set queue lowpri started = True Create the <b>service</b> queue: create queue service set queue service queue_type = Execution set queue service Priority = 100 set queue service enabled = True set queue service started = True Create the <b>volunteer</b> queue: create queue volunteer set queue volunteer queue_type = Execution set queue volunteer Priority = 0 set queue volunteer enabled = True set queue volunteer started = True Set some server settings: set server default_queue=normal set server resources_default.nodes = 1 set server resources_default.walltime = 12:00:00 Now edit the maui configuration for these queues: <source lang="text"> SERVERHOST heterogeneous ADMIN1 root RMPOLLINTERVAL 00:00:01 DEFERTIME 0 DEFERCOUNT 86400 PREEMPTIONPOLICY REQUEUE QUEUETIMEWEIGHT 1 QOSWEIGHT 1 SYSCFG QLIST=normal,lowpri,service,volunteer QOSCFG[normal] QFLAGS=PREEMPTOR QOSCFG[lowpri] QFLAGS=PREEMPTEE,PREEMPTOR QOSCFG[service] QFLAGS=PREEMPTEE,PREEMPTOR QOSCFG[volunteer] QFLAGS=PREEMPTEE CREDWEIGHT 1 CLASSWEIGHT 1 CLASSCFG[normal] QDEF=normal PRIORITY=10000 CLASSCFG[lowpri] QDEF=lowpri PRIORITY=1000 CLASSCFG[service] QDEF=service PRIORITY=100 CLASSCFG[volunteer] QDEF=volunteer PRIORITY=0 </source>
After all packages are configured execute: <source lang="text"> service apache2 restart service gmetad restart </source> Pointing your browser to [http://heterogeneous.ucd.ie/ganglia/index.php here] should display the monitoring page for HCL Cluster. <code>gmond</code> must also be installed and configured on the cluster nodes. =Hardware Monitoring & Backup= ==Disk Monitoring== Install smartmontools as per [http://www.howtoforge.com/checking-hard-disk-sanity-with-smartmontools-debian-ubuntu here]. Briefly: apt-get install smartmontools Edit <code>/etc/defaults/smartmontools</code> so that it contains: <source lang="text"> # List of devices you want to explicitly enable S.M.A.R.T. for # Not needed (and not recommended) if the device is monitored by smartd enable_smart="/dev/sda" # uncomment to start smartd on system startup start_smartd=yes # uncomment to pass additional options to smartd on startup smartd_opts="--interval=1800" </source> Open <code>/etc/smartd.conf</code> and edit the first line that begins with DEVICESCAN (all lines after the first instance of DEVICESCAN are ignored). Have it read something like: DEVICESCAN -d removable -n standby -m root -m robert_higgins@iol.ie -M exec /usr/share/smartmontools/smartd-runner Then start the service <code>/etc/init.d/smartmontools start</code> Note, consider installing this on all nodes, as it would be interesting to have prior notice of any failing disks. = Torque - PBS = Download Torque from [http://www.clusterresources.com/pages/products/torque-resource-manager.php here]. Extract archive and configure: ./configure --prefix=/usr/local Run <code>make && make install</code>. Torque will have installed files under <code>/usr/local</code> and <code>/var/spool/torque</code>. 
Edit the Torque config file <code>/var/spool/torque/torque.cfg</code> adding the line: SERVERHOST=192.168.20.254 Add the compute nodes of the cluster to the file <code>/var/spool/torque/server_priv/nodes</code> <source lang="text"> hcl01 hcl02 hcl03 hcl04 hcl05 hcl06 hcl07 hcl08 hcl09 hcl10 hcl11 hcl12 hcl13 hcl14 hcl15 hcl16 </source> Now return to the extracted source distribution, find the <code>contrib/init.d</code> folder and copy the <code>debian.pbs_server</code> to <code>/etc/init.d/pbs_server</code>. Edit the inplace file so that the header reads: <source lang="bash"> #!/bin/sh ### BEGIN INIT INFO # Provides: pbs_server # Required-Start: $local_fs $named # Should-Start: # Required-Stop: </source> This will ensure that important services are started before the PBS/Torque daemon. ===Queues=== We will configure 4 queues, with varying priority and preemption policies. *The first is the <b>normal</b> queue. This will be where most jobs, which should not be interrupted, will be placed. It has the highest priority. *The second queue is <b>lowpri</b>, it is for running jobs which may run for extended periods but are interruptible. Should a job <i>A</i> be placed on the normal queue while a lowpri job <i>B</i> is running, <i>B</i> will be sent a kill signal, so it may shut down cleanly, and then it will be requeued on the system, so it can resume running after <i>A</i>b has finished. *The third queue is for running of service jobs like the homedir backup. It is named <b>service</b>. This will have lower priority than the above queues and jobs running on this queue will be preemptable. *The final queue is the <b>volunteer</b> queue. This has the lowest priority and is for executing volunteer computing jobs during otherwise idle periods. These jobs will also be preemptable. ====Queue Setup==== Enter the queue manager program <code>qmgr</code> as the root user on heterogeneous. 
Send it the following commands: Allow all users to see all queued jobs: set server query_other_jobs=TRUE' Create the default <b>normal</b> queue create queue normal set queue normal queue_type = Execution set queue normal Priority = 10000 set queue normal enabled = True set queue normal started = True Create <b>lowpri</b> queue create queue lowpri set queue lowpri queue_type = Execution set queue lowpri Priority = 1000 set queue lowpri enabled = True set queue lowpri started = True Create <b>service</b> queue set queue service queue_type = Execution set queue service Priority = 100 set queue service enabled = True set queue service started = True Create <b>volunteer</b> queue set queue volunteer queue_type = Execution set queue volunteer Priority = 0 set queue volunteer enabled = True set queue volunteer started = True Set some server settings set server default_queue=normal set server resources_default.nodes = 1 set server resources_default.walltime = 12:00:00 Now edit the maui configuration for these queues: <source lang="text"> SERVERHOST heterogeneous ADMIN1 root RMPOLLINTERVAL 00:00:01 DEFERTIME 0 DEFERCOUNT 86400 PREEMPTIONPOLICY REQUEUE QUEUETIMEWEIGHT 1 QOSWEIGHT 1 SYSCFG QLIST=normal,lowpri,service,volunteer QOSCFG[normal] QFLAGS=PREEMPTOR QOSCFG[lowpri] QFLAGS=PREEMPTEE,PREEMPTOR QOSCFG[service] QFLAGS=PREEMPTEE,PREEMPTOR QOSCFG[volunteer] QFLAGS=PREEMPTEE CREDWEIGHT 1 CLASSWEIGHT 1 CLASSCFG[normal] QDEF=normal PRIORITY=10000 CLASSCFG[lowpri] QDEF=lowpri PRIORITY=1000 CLASSCFG[service] QDEF=service PRIORITY=100 CLASSCFG[volunteer] QDEF=volunteer PRIORITY=0 </source> cb01778849cb42aef4b86302f170924fb553243c 499 496 2010-10-07T20:20:55Z Davepc 2 wikitext text/x-wiki * Basic installation of Debian Squeeze ==Networking== ===Interfaces=== * edit <code>/etc/networks/interfaces</code> Note that at some point eth1 should be configured by DHCP, it is on the UCD LAN and must be registered correctly (update MAC address with services). 
<code>eth0</code> is the internal network. <source lang="text"> # The loopback network interface auto lo eth0 eth1 iface lo inet loopback # The primary network interface allow-hotplug eth0 iface eth0 inet static address 192.168.21.254 netmask 255.255.255.0 gateway 192.168.21.1 iface eth1 inet static address 193.1.132.124 netmask 255.255.252.0 gateway 193.1.132.1 </source> * Install non-free linux firmware for network interface (eth0). This will allow Gigabit operation on eth0 with the tg3 hardware (I think). Edit <code>/etc/apt/sources.list</code> including the lines: <source lang="text">deb http://ftp.ie.debian.org/debian/ squeeze main contrib non-free deb-src http://ftp.ie.debian.org/debian/ squeeze main contrib non-free</source> * Install firmware-linux: <source lang="text">apt-get update && apt-get install firmware-linux</source> You probably need to reboot now. ===DNS / BIND=== We will run our own DNS server for the cluster. First set resolv.conf: <source lang="text"> nameserver 127.0.0.1 nameserver 137.43.116.19 nameserver 137.43.116.17 nameserver 137.43.105.22 domain ucd.ie search ucd.ie </source> Now install bind9 (<code>apt-get install bind9</code>). Edit <code>/etc/bind/named.conf.local</code> and set the domain zones for the cluster (forwards and reverse). 
We have two subnets for which reverse lookups must be specified: 192.168.20 and 192.168.21.
<source lang="text">
//
// Do any local configuration here
//

// Consider adding the 1918 zones here, if they are not used in your
// organization
//include "/etc/bind/zones.rfc1918";

include "/etc/bind/rndc.key";

controls {
    inet 127.0.0.1 allow { localhost; } keys { "rndc-key"; };
};

zone "heterogeneous.ucd.ie" {
    type master;
    file "db.heterogeneous.ucd.ie";
};

zone "21.168.192.in-addr.arpa" {
    type master;
    file "db.192.168.21";
};

zone "20.168.192.in-addr.arpa" {
    type master;
    file "db.192.168.20";
};
</source>
Also edit the options file <code>/etc/bind/named.conf.options</code>. Note the subnet we define in the allow sections, 192.168.20.0/23: it permits access from 192.168.20.* and 192.168.21.* addresses.
<source lang="text">
options {
    directory "/var/cache/bind";

    // If there is a firewall between you and nameservers you want
    // to talk to, you may need to fix the firewall to allow multiple
    // ports to talk. See http://www.kb.cert.org/vuls/id/800113

    // If your ISP provided one or more IP addresses for stable
    // nameservers, you probably want to use them as forwarders.
    forwarders { 137.43.116.19; 137.43.116.17; 137.43.105.22; };

    recursion yes;
    version "REFUSED";
    allow-recursion { 127.0.0.1; 192.168.20.0/23; };
    allow-query { 127.0.0.1; 192.168.20.0/23; };

    auth-nxdomain no;    # conform to RFC1035
    listen-on-v6 { any; };
};
</source>
Now work on the zone files specified above, <code>/var/cache/bind/db.heterogeneous.ucd.ie</code> and the reverse maps <code>/var/cache/bind/db.192.168.20</code> & <code>/var/cache/bind/db.192.168.21</code>. Populate them with all nodes of the cluster.

===IP Tables===
* Set up <code>iptables</code>. We want to implement NAT between the internal network (<code>eth0</code>) and the external one (<code>eth1</code>).
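Stripped to its essence, the NAT we want is three commands: masquerade what leaves eth1, forward what arrives from eth0, and enable kernel IP forwarding. This is a sketch only; the boot-time script in the next step adds rule flushing and the INPUT rules on top of it:

<source lang="bash">
# Rewrite the source address of packets leaving the external interface
iptables -t nat -A POSTROUTING -o eth1 -j MASQUERADE
# Permit forwarding of traffic originating on the internal network
iptables -A FORWARD -i eth0 -o eth1 -j ACCEPT
# Turn on routing in the kernel
echo 1 > /proc/sys/net/ipv4/ip_forward
</source>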
Add a script named <code>00iptables</code> to the <code>/etc/network/if-up.d</code> directory. All scripts in this directory are executed after the network interfaces are brought up, so the rules will persist across reboots:
<source lang="bash">
#!/bin/sh
PATH=/usr/sbin:/sbin:/bin:/usr/bin

IF_INT=eth0
IF_EXT=eth1

# Delete all existing rules.
iptables -F
iptables -t nat -F
iptables -t mangle -F
iptables -X

# Always accept loopback traffic
iptables -A INPUT -i lo -j ACCEPT

# Allow established connections, and those not coming from the outside
iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
iptables -A INPUT -m state --state NEW ! -i $IF_EXT -j ACCEPT
iptables -A FORWARD -i $IF_EXT -o $IF_INT -m state --state ESTABLISHED,RELATED -j ACCEPT

# Allow outgoing connections from the LAN side.
iptables -A FORWARD -i $IF_INT -o $IF_EXT -j ACCEPT

# Masquerade.
iptables -t nat -A POSTROUTING -o $IF_EXT -j MASQUERADE

# Don't forward from the outside to the inside.
iptables -A FORWARD -i $IF_EXT -o $IF_INT -j REJECT

# Enable routing.
echo 1 > /proc/sys/net/ipv4/ip_forward
</source>

==NFS==
Install:
 apt-get install nfs-kernel-server nfs-common portmap
Add to <code>/etc/exports</code>:
 /home 192.168.20.0/255.255.254.0(rw,sync,no_root_squash,no_subtree_check)
Activate the export with:
 exportfs -a

==Clonezilla==
Firstly, Clonezilla is probably going to pollute a lot of your server configuration when it sets itself up. Be prepared to lose your iptables configuration, NFS (if any) and DHCP settings. Maybe more.
* Follow the guide to installing Clonezilla [http://www.howtoforge.com/cloning-linux-systems-with-clonezilla-server-edition-clonezilla-se-p2 here].
Essentially:
** Add the repository key: <source lang="text">wget -q http://drbl.sourceforge.net/GPG-KEY-DRBL -O- | apt-key add -</source>
** Add the line to /etc/apt/sources.list: <source lang="text">deb http://drbl.sourceforge.net/drbl-core drbl stable</source>
** Run: <source lang="text">apt-get update && apt-get install drbl && /opt/drbl/sbin/drbl4imp</source>
** Accept the default options to drbl4imp.
* After Clonezilla has installed, edit <code>/etc/dhcp3/dhcpd.conf</code>, adding entries for the test nodes <code>hcl07</code> and <code>hcl03</code>. Also ensure these nodes have been removed from the existing DHCP configuration on heterogeneous.ucd.ie, so that they are only served by one machine.
<source lang="text">
default-lease-time 300;
max-lease-time 300;
option subnet-mask 255.255.255.0;
option domain-name-servers 137.43.116.19,137.43.116.17,137.43.105.22;
option domain-name "ucd.ie";
ddns-update-style none; # brett had ad-hoc ...?
server-name drbl;
filename = "pxelinux.0";

subnet 192.168.21.0 netmask 255.255.255.0 {
    option subnet-mask 255.255.255.0;
    option routers 192.168.21.1;
    next-server 192.168.21.254;
    pool {
        # allow members of "DRBL-Client";
        range 192.168.21.200 192.168.21.212;
    }
    host hcl03 {
        option host-name "hcl03.ucd.ie";
        hardware ethernet 00:14:22:0A:22:6C;
        fixed-address 192.168.21.5;
    }
    host hcl03_eth1 {
        option host-name "hcl03_eth1.ucd.ie";
        hardware ethernet 00:14:22:0A:22:6D;
        fixed-address 192.168.21.105;
    }
    host hcl07 {
        option host-name "hcl07.ucd.ie";
        hardware ethernet 00:14:22:0A:20:E2;
        fixed-address 192.168.21.9;
    }
    host hcl07_eth1 {
        option host-name "hcl07_eth1.ucd.ie";
        hardware ethernet 00:14:22:0A:20:E3;
        fixed-address 192.168.21.109;
    }
    default-lease-time 21600;
    max-lease-time 43200;
}
</source>

==NTP Daemon==
Nodes in the cluster should use the Network Time Protocol to set their clocks; heterogeneous.ucd.ie should provide them that service.
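The rest of this section configures the server side. On each compute node the client side amounts to pointing ntpd at the head node (a fragment, assuming the nodes resolve the head node through the cluster DNS):

<source lang="text">
# /etc/ntp.conf on a compute node (fragment)
server heterogeneous.ucd.ie iburst
</source>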
Install ntpd with the following command:
 apt-get install ntp
Configure the daemon with the following line in <code>/etc/ntp.conf</code>:
 restrict 192.168.20.0 mask 255.255.254.0 nomodify notrap

==Install DHCP==
Install the DHCP server package with <code>apt-get install dhcp3-server</code>. When you install Clonezilla it will probably pollute your DHCP server setup, so make ... TODO

==Install NIS==
Copy the users from <code>passwd</code>, <code>group</code> and <code>shadow</code> in <code>/etc</code> on <code>hcl01</code>. Install nis. Set the domain to heterogeneous.ucd.ie.

Edit <code>/etc/defaultdomain</code> so that it contains:
 heterogeneous.ucd.ie
Edit <code>/etc/default/nis</code> so that it contains:
 # Are we a NIS server and if so what kind (values: false, slave, master)
 NISSERVER=master
Edit <code>/etc/ypserv.securenets</code> so that it contains:
 # allow connects from local
 255.0.0.0       127.0.0.0
 # allow connections from heterogeneous subnets .20 and .21
 255.255.254.0   192.168.20.0
The NIS host is also a client of itself, so do the client set-up as follows. Edit <code>/etc/hosts</code> and ensure the NIS master is listed:
 192.168.20.254 heterogeneous.ucd.ie heterogeneous
Edit <code>/etc/yp.conf</code> and ensure that it contains:
 domain heterogeneous.ucd.ie server localhost
Edit <code>/etc/passwd</code>, adding a line to the end that reads <code>+::::::</code>. Edit <code>/etc/group</code>, adding a line <code>+:::</code> at the end. The NIS Makefile will not pull user IDs and group IDs that are lower than a certain value; we must set this to 500 in <code>/var/yp/Makefile</code>:
 MINUID=500
 MINGID=500
Start the <code>ypbind</code> and <code>yppasswd</code> services. Then initialise the NIS database:
 /usr/lib/yp/ypinit -m
Accept the defaults at the prompts. Now start the other NIS services:
 service nis start

==Installing Ganglia Frontend==
Install the packages gmetad and ganglia-webfrontend.
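On Debian that is one command (package names assumed to match the squeeze archive):

<source lang="text">apt-get install gmetad ganglia-webfrontend</source>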
Configure the front end by appending the following to <code>/etc/apache2/apache2.conf</code>:
 Include /etc/ganglia-webfrontend/apache.conf
Configure gmetad by adding the following lines to <code>/etc/ganglia/gmetad.conf</code>:
 data_source "HCL Cluster" 192.168.20.1 192.168.20.16
 data_source "HCL Service" localhost:8648
This means that the gmetad collector connects to hcl01 and hcl16 on the .20 subnet to gather data for the frontend to use. After all packages are configured, execute:
<source lang="text">
service apache2 restart
service gmetad restart
</source>
Pointing your browser to [http://heterogeneous.ucd.ie/ganglia/index.php here] should display the monitoring page for the HCL Cluster. <code>gmond</code> must also be installed and configured on the cluster nodes.

=Hardware Monitoring & Backup=

==Disk Monitoring==
Install smartmontools as per [http://www.howtoforge.com/checking-hard-disk-sanity-with-smartmontools-debian-ubuntu here]. Briefly:
 apt-get install smartmontools
Edit <code>/etc/default/smartmontools</code> so that it contains:
<source lang="text">
# List of devices you want to explicitly enable S.M.A.R.T. for
# Not needed (and not recommended) if the device is monitored by smartd
enable_smart="/dev/sda"
# uncomment to start smartd on system startup
start_smartd=yes
# uncomment to pass additional options to smartd on startup
smartd_opts="--interval=1800"
</source>
Open <code>/etc/smartd.conf</code> and edit the first line that begins with DEVICESCAN (all lines after the first instance of DEVICESCAN are ignored). Have it read something like:
 DEVICESCAN -d removable -n standby -m root -m robert_higgins@iol.ie -M exec /usr/share/smartmontools/smartd-runner
Then start the service: <code>/etc/init.d/smartmontools start</code>. Note: consider installing this on all nodes, as it would be useful to have prior notice of any failing disks.

= Torque - PBS =
Download Torque from [http://www.clusterresources.com/pages/products/torque-resource-manager.php here].
Extract the archive and configure:
 ./configure --prefix=/usr/local
Run <code>make && make install</code>. Torque will have installed files under <code>/usr/local</code> and <code>/var/spool/torque</code>. Edit the Torque config file <code>/var/spool/torque/torque.cfg</code>, adding the line:
 SERVERHOST=192.168.20.254
Add the compute nodes of the cluster to the file <code>/var/spool/torque/server_priv/nodes</code>:
<source lang="text">
hcl01
hcl02
hcl03
hcl04
hcl05
hcl06
hcl07
hcl08
hcl09
hcl10
hcl11
hcl12
hcl13
hcl14
hcl15
hcl16
</source>
Now return to the extracted source distribution, find the <code>contrib/init.d</code> folder and copy <code>debian.pbs_server</code> to <code>/etc/init.d/pbs_server</code>. Edit the copied file so that the header reads:
<source lang="bash">
#!/bin/sh
### BEGIN INIT INFO
# Provides:       pbs_server
# Required-Start: $local_fs $named
# Should-Start:
# Required-Stop:
</source>
This will ensure that important services are started before the PBS/Torque daemon.

===Queues===
We will configure four queues, with varying priority and preemption policies.
*The first is the <b>normal</b> queue. This is where most jobs, which should not be interrupted, will be placed. It has the highest priority.
*The second queue is <b>lowpri</b>; it is for jobs which may run for extended periods but are interruptible. Should a job <i>A</i> be placed on the normal queue while a lowpri job <i>B</i> is running, <i>B</i> will be sent a kill signal, so it may shut down cleanly, and then be requeued on the system, so it can resume running after <i>A</i> has finished.
*The third queue, named <b>service</b>, is for running service jobs like the homedir backup. It has lower priority than the above queues, and jobs running on it will be preemptable.
*The final queue is the <b>volunteer</b> queue. This has the lowest priority and is for executing volunteer computing jobs during otherwise idle periods.
These jobs will also be preemptable. ====Queue Setup==== Enter the queue manager program <code>qmgr</code> as the root user on heterogeneous. Send it the following commands: Allow all users to see all queued jobs: set server query_other_jobs=TRUE' Create the default <b>normal</b> queue create queue normal set queue normal queue_type = Execution set queue normal Priority = 10000 set queue normal enabled = True set queue normal started = True Create <b>lowpri</b> queue create queue lowpri set queue lowpri queue_type = Execution set queue lowpri Priority = 1000 set queue lowpri enabled = True set queue lowpri started = True Create <b>service</b> queue set queue service queue_type = Execution set queue service Priority = 100 set queue service enabled = True set queue service started = True Create <b>volunteer</b> queue set queue volunteer queue_type = Execution set queue volunteer Priority = 0 set queue volunteer enabled = True set queue volunteer started = True Set some server settings set server default_queue=normal set server resources_default.nodes = 1 set server resources_default.walltime = 12:00:00 Now edit the maui configuration for these queues: <source lang="text"> SERVERHOST heterogeneous ADMIN1 root RMPOLLINTERVAL 00:00:01 DEFERTIME 0 DEFERCOUNT 86400 PREEMPTIONPOLICY REQUEUE QUEUETIMEWEIGHT 1 QOSWEIGHT 1 SYSCFG QLIST=normal,lowpri,service,volunteer QOSCFG[normal] QFLAGS=PREEMPTOR QOSCFG[lowpri] QFLAGS=PREEMPTEE,PREEMPTOR QOSCFG[service] QFLAGS=PREEMPTEE,PREEMPTOR QOSCFG[volunteer] QFLAGS=PREEMPTEE CREDWEIGHT 1 CLASSWEIGHT 1 CLASSCFG[normal] QDEF=normal PRIORITY=10000 CLASSCFG[lowpri] QDEF=lowpri PRIORITY=1000 CLASSCFG[service] QDEF=service PRIORITY=100 CLASSCFG[volunteer] QDEF=volunteer PRIORITY=0 </source> 0a58b490b44c09b4770029970adf10ddfd2330ae 502 499 2010-10-12T17:36:30Z Davepc 2 /* Torque - PBS */ wikitext text/x-wiki * Basic installation of Debian Squeeze ==Networking== ===Interfaces=== * edit <code>/etc/networks/interfaces</code> Note that at some 
point eth1 should be configured by DHCP, it is on the UCD LAN and must be registered correctly (update MAC address with services). <code>eth0</code> is the internal network. <source lang="text"> # The loopback network interface auto lo eth0 eth1 iface lo inet loopback # The primary network interface allow-hotplug eth0 iface eth0 inet static address 192.168.21.254 netmask 255.255.255.0 gateway 192.168.21.1 iface eth1 inet static address 193.1.132.124 netmask 255.255.252.0 gateway 193.1.132.1 </source> * Install non-free linux firmware for network interface (eth0). This will allow Gigabit operation on eth0 with the tg3 hardware (I think). Edit <code>/etc/apt/sources.list</code> including the lines: <source lang="text">deb http://ftp.ie.debian.org/debian/ squeeze main contrib non-free deb-src http://ftp.ie.debian.org/debian/ squeeze main contrib non-free</source> * Install firmware-linux: <source lang="text">apt-get update && apt-get install firmware-linux</source> You probably need to reboot now. ===DNS / BIND=== We will run our own DNS server for the cluster. First set resolv.conf: <source lang="text"> nameserver 127.0.0.1 nameserver 137.43.116.19 nameserver 137.43.116.17 nameserver 137.43.105.22 domain ucd.ie search ucd.ie </source> Now install bind9 (<code>apt-get install bind9</code>). Edit <code>/etc/bind/named.conf.local</code> and set the domain zones for the cluster (forwards and reverse). 
We have two subdomains where reverse lookups will have to be specified 192.168.20 and 192.168.21 <source lang="text"> // // Do any local configuration here // // Consider adding the 1918 zones here, if they are not used in your // organization //include "/etc/bind/zones.rfc1918"; include "/etc/bind/rndc.key"; controls { inet 127.0.0.1 allow { localhost; } keys { "rndc-key"; }; }; zone "heterogeneous.ucd.ie" { type master; file "db.heterogeneous.ucd.ie"; }; zone "21.168.192.in-addr.arpa" { type master; file "db.192.168.21"; }; zone "20.168.192.in-addr.arpa" { type master; file "db.192.168.20"; }; </source> Also edit the options file: <code>/etc/bind/named.conf.options</code>, note the subnet we define in the allow sections, 192.168.20/23, it will permit access from 192.168.20.* and 192.168.21.* addresses. <source lang="text"> options { directory "/var/cache/bind"; // If there is a firewall between you and nameservers you want // to talk to, you may need to fix the firewall to allow multiple // ports to talk. See http://www.kb.cert.org/vuls/id/800113 // If your ISP provided one or more IP addresses for stable // nameservers, you probably want to use them as forwarders. // Uncomment the following block, and insert the addresses replacing // the all-0's placeholder. forwarders { 137.43.116.19; 137.43.116.17; 137.43.105.22; }; recursion yes; version "REFUSED"; allow-recursion { 127.0.0.1; 192.168.20.0/23; }; allow-query { 127.0.0.1; 192.168.20.0/23; }; auth-nxdomain no; # conform to RFC1035 listen-on-v6 { any; }; }; </source> Now work on the zone files specified <code>/var/cache/bind/db.heterogneneous.ucd.ie</code> and the reverse maps <code>/var/cache/bind/db.192.168.21</code> & <code>/var/cache/bind/db.192.168.21</code> Populate them with all nodes of the cluster. ===IP Tables=== * Set up <code>iptables</code>. We want to implement NAT between the internal network (<code>eth0</code>) and external one (<code>eth1</code>). 
Add a script to <code>/etc/network/if-up.d</code> directory, named <code>00iptables</code>. All scripts in this directory will be executed after network interfaces are brought up, so this will persist: <source lang="bash"> #!/bin/sh PATH=/usr/sbin:/sbin:/bin:/usr/bin IF_INT=eth0 IF_EXT=eth1 # # delete all existing rules. # iptables -F iptables -t nat -F iptables -t mangle -F iptables -X # Always accept loopback traffic iptables -A INPUT -i lo -j ACCEPT # Allow established connections, and those not coming from the outside iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT iptables -A INPUT -m state --state NEW ! -i $IF_EXT -j ACCEPT iptables -A FORWARD -i $IF_EXT -o $IF_INT -m state --state ESTABLISHED,RELATED -j ACCEPT # Allow outgoing connections from the LAN side. iptables -A FORWARD -i $IF_INT -o $IF_EXT -j ACCEPT # Masquerade. iptables -t nat -A POSTROUTING -o $IF_EXT -j MASQUERADE # Don't forward from the outside to the inside. iptables -A FORWARD -i $IF_EXT -o $IF_EXT -j REJECT # Enable routing. echo 1 > /proc/sys/net/ipv4/ip_forward </source> ==NFS== Install: apt-get install nfs-kernel-server nfs-common portmap Add to /etc/exports /home 192.168.20.0/255.255.254.0(rw,sync,no_root_squash,no_subtree_check) Restart nfs server with exportfs -a ==Clonezilla== Firstly, Clonezilla is probably going to pollute a lot of your server configuration when it sets itself up. Be prepared to loose your IPtables configuration, NFS (if any) and DHCP settings. Maybe more. * follow the guide to installing Clonezilla [http://www.howtoforge.com/cloning-linux-systems-with-clonezilla-server-edition-clonezilla-se-p2 here]. 
Essentially: ** add repository key <source lang="text">wget -q http://drbl.sourceforge.net/GPG-KEY-DRBL -O- | apt-key add -</source> ** the line add to /etc/apt/sources.list: <source lang="text">deb http://drbl.sourceforge.net/drbl-core drbl stable</source> ** run: <source lang="text">apt-get update && apt-get install drbl && /opt/drbl/sbin/drbl4imp</source> ** accept default options to drbl4imp. * After Clonezilla has installed edit <code>/etc/dhcpd3/dhcpd.conf</code>, adding all entries for test nodes <code>hcl07</code> and <code>hcl03</code>. Also ensure these nodes have been removed from the inplace heterogeneous.ucd.ie server so that they are only served by one machine. <source lang="text"> default-lease-time 300; max-lease-time 300; option subnet-mask 255.255.255.0; option domain-name-servers 137.43.116.19,137.43.116.17,137.43.105.22; option domain-name "ucd.ie"; ddns-update-style none; # brett had ad-hoc ...? server-name drbl; filename = "pxelinux.0"; subnet 192.168.21.0 netmask 255.255.255.0 { option subnet-mask 255.255.255.0; option routers 192.168.21.1; next-server 192.168.21.254; pool { # allow members of "DRBL-Client"; range 192.168.21.200 192.168.21.212; } host hcl03 { option host-name "hcl03.ucd.ie"; hardware ethernet 00:14:22:0A:22:6C; fixed-address 192.168.21.5; } host hcl03_eth1 { option host-name "hcl03_eth1.ucd.ie"; hardware ethernet 00:14:22:0A:22:6D; fixed-address 192.168.21.105; } host hcl07 { option host-name "hcl07.ucd.ie"; hardware ethernet 00:14:22:0A:20:E2; fixed-address 192.168.21.9; } host hcl07_eth1 { option host-name "hcl07_eth1.ucd.ie"; hardware ethernet 00:14:22:0A:20:E3; fixed-address 192.168.21.109; } default-lease-time 21600; max-lease-time 43200; } </source> ==NTP Daemon== Nodes in the cluster should use the Network Time Protocol to set their clocks, heterogeneous.ucd.ie should provide them that service. 
Install ntpd with the following command: apt-get install ntp Configure the daemon with the following line in <code>/etc/ntp.conf</code>: restrict 192.168.20.0 mask 255.255.254.0 nomodify notrap ==Install DHCP== Install the DHCP server package with <code>apt-get install dhcp3-server</code>. When you install Clonezilla it will probably pollute your DHCP server setup, so make ... TODO ==Install NIS== Copy users from <code>passwd</code>, <code>groups</code> and <code>shadow</code> from <code>/etc</code> on <code>hcl01</code>. Install nis. Set domain to heterogeneous.ucd.ie Edit <code>/etc/defaultdomain</code> so that it contains: heterogeneous.ucd.ie Edit <code>/etc/default/nis</code> so that it contains: # Are we a NIS server and if so what kind (values: false, slave, master) NISSERVER=master Edit <code>/etc/ypserv.securenets</code> so that is contains: # allow connects from local 255.0.0.0 127.0.0.0 # allow connections from heterogeneous subnets .20 and .21 255.255.254.0 192.168.20.0 The NIS host is also a client of itself, so do the client set up as follows: Edit <code>/etc/hosts</code> end ensure the NIS Master is listed 192.168.20.254 heterogeneous.ucd.ie heterogeneous Edit <code>/etc/yp.conf</code> and ensure that it contains: domain heterogeneous.ucd.ie server localhost Edit <code>/etc/passwd</code> adding a line to the end that reads: <code>+::::::</code>. Edit <code>/etc/group</code> with a line <code>+:::</code> at the line. The NIS Makefile will not pull userid and groupids that are lower than a certain value, we must set this to 500 in <code>/var/yp/Makefile</code> MINUID=500 MINGID=500 Start the <code>ypbind</code> and <code>yppasswd</code> services. Then initialise the NIS database: /usr/lib/yp/ypinit -m Accept defaults at prompts. Now start other NIS services service nis start ==Installing Ganglia Frontend== Install the packages gmetad and ganglia-webfrontend. 
Configure the front end by appending the following to <code>/etc/apache2/apache2.conf</code>:
 Include /etc/ganglia-webfrontend/apache.conf
Configure gmetad by adding the following lines to <code>/etc/ganglia/gmetad.conf</code>:
 data_source "HCL Cluster" 192.168.20.1 192.168.20.16
 data_source "HCL Service" localhost:8648
This means that the gmetad collector connects to hcl01 and hcl16 on the .20 subnet to gather data for the frontend to use. After all packages are configured, execute:
<source lang="text">
service apache2 restart
service gmetad restart
</source>
Pointing your browser [http://heterogeneous.ucd.ie/ganglia/index.php here] should display the monitoring page for the HCL Cluster. <code>gmond</code> must also be installed and configured on the cluster nodes.

=Hardware Monitoring & Backup=
==Disk Monitoring==
Install smartmontools as per the guide [http://www.howtoforge.com/checking-hard-disk-sanity-with-smartmontools-debian-ubuntu here]. Briefly:
 apt-get install smartmontools
Edit <code>/etc/default/smartmontools</code> so that it contains:
<source lang="text">
# List of devices you want to explicitly enable S.M.A.R.T. for
# Not needed (and not recommended) if the device is monitored by smartd
enable_smart="/dev/sda"

# uncomment to start smartd on system startup
start_smartd=yes

# uncomment to pass additional options to smartd on startup
smartd_opts="--interval=1800"
</source>
Open <code>/etc/smartd.conf</code> and edit the first line that begins with DEVICESCAN (all lines after the first instance of DEVICESCAN are ignored). Have it read something like:
 DEVICESCAN -d removable -n standby -m root -m robert_higgins@iol.ie -M exec /usr/share/smartmontools/smartd-runner
Then start the service: <code>/etc/init.d/smartmontools start</code>. Note: consider installing this on all nodes, as it would be useful to have prior notice of any failing disks.

= Torque - PBS =
Download Torque from [http://www.clusterresources.com/pages/products/torque-resource-manager.php here].
Extract the archive and configure:
 ./configure --prefix=/usr/local
Run <code>make && make install</code>. Torque will have installed files under <code>/usr/local</code> and <code>/var/spool/torque</code>. Edit the Torque config file <code>/var/spool/torque/torque.cfg</code>, adding the line:
 SERVERHOST=192.168.20.254
Add the compute nodes of the cluster to the file <code>/var/spool/torque/server_priv/nodes</code>:
<source lang="text">
hcl01
hcl02
hcl03
hcl04
hcl05
hcl06
hcl07
hcl08
hcl09
hcl10
hcl11
hcl12
hcl13
hcl14
hcl15
hcl16
</source>
By default <code>/usr/local/lib</code> may not be in the list of directories searched for dynamically linked libraries. If so, add that path to the end of <code>/etc/ld.so.conf.d/local.conf</code> and run <code>ldconfig</code> to update the cache. Now return to the extracted source distribution, find the <code>contrib/init.d</code> folder and copy <code>debian.pbs_server</code> to <code>/etc/init.d/pbs_server</code>. Edit the copied file so that the header reads:
<source lang="bash">
#!/bin/sh
### BEGIN INIT INFO
# Provides:       pbs_server
# Required-Start: $local_fs $named
# Should-Start:
# Required-Stop:
</source>
This will ensure that important services are started before the PBS/Torque daemon. Then register the init script:
 update-rc.d pbs_server defaults

===Queues===
We will configure four queues, with varying priority and preemption policies.
*The first is the <b>normal</b> queue. This is where most jobs, which should not be interrupted, will be placed. It has the highest priority.
*The second queue is <b>lowpri</b>; it is for jobs which may run for extended periods but are interruptible. Should a job <i>A</i> be placed on the normal queue while a lowpri job <i>B</i> is running, <i>B</i> will be sent a kill signal so that it may shut down cleanly, and it will then be requeued on the system, so it can resume running after <i>A</i> has finished.
*The third queue is for running service jobs such as the homedir backup. It is named <b>service</b>.
This will have lower priority than the above queues, and jobs running on it will be preemptable.
*The final queue is the <b>volunteer</b> queue. This has the lowest priority and is for executing volunteer computing jobs during otherwise idle periods. These jobs will also be preemptable.

====Queue Setup====
Enter the queue manager program <code>qmgr</code> as the root user on heterogeneous and send it the following commands. Allow all users to see all queued jobs:
 set server query_other_jobs = TRUE
Create the default <b>normal</b> queue:
 create queue normal
 set queue normal queue_type = Execution
 set queue normal Priority = 10000
 set queue normal enabled = True
 set queue normal started = True
Create the <b>lowpri</b> queue:
 create queue lowpri
 set queue lowpri queue_type = Execution
 set queue lowpri Priority = 1000
 set queue lowpri enabled = True
 set queue lowpri started = True
Create the <b>service</b> queue:
 create queue service
 set queue service queue_type = Execution
 set queue service Priority = 100
 set queue service enabled = True
 set queue service started = True
Create the <b>volunteer</b> queue:
 create queue volunteer
 set queue volunteer queue_type = Execution
 set queue volunteer Priority = 0
 set queue volunteer enabled = True
 set queue volunteer started = True
Set some server-wide defaults:
 set server default_queue = normal
 set server resources_default.nodes = 1
 set server resources_default.walltime = 12:00:00
Now edit the Maui configuration for these queues:
<source lang="text">
SERVERHOST            heterogeneous
ADMIN1                root
RMPOLLINTERVAL        00:00:01
DEFERTIME             0
DEFERCOUNT            86400
PREEMPTIONPOLICY      REQUEUE
QUEUETIMEWEIGHT       1
QOSWEIGHT             1
SYSCFG                QLIST=normal,lowpri,service,volunteer
QOSCFG[normal]        QFLAGS=PREEMPTOR
QOSCFG[lowpri]        QFLAGS=PREEMPTEE,PREEMPTOR
QOSCFG[service]       QFLAGS=PREEMPTEE,PREEMPTOR
QOSCFG[volunteer]     QFLAGS=PREEMPTEE
CREDWEIGHT            1
CLASSWEIGHT           1
CLASSCFG[normal]      QDEF=normal PRIORITY=10000
CLASSCFG[lowpri]      QDEF=lowpri PRIORITY=1000
CLASSCFG[service]     QDEF=service PRIORITY=100
CLASSCFG[volunteer]   QDEF=volunteer PRIORITY=0
</source>

* Basic installation of Debian Squeeze

==Networking==
===Interfaces===
* Edit <code>/etc/network/interfaces</code>. Note that at some point eth1 should be configured by DHCP; it is on the UCD LAN and must be registered correctly (update the MAC address with services). <code>eth0</code> is the internal network.
<source lang="text">
# The loopback network interface
auto lo eth0 eth1
iface lo inet loopback

# The primary network interface
allow-hotplug eth0
iface eth0 inet static
    address 192.168.21.254
    netmask 255.255.255.0
    gateway 192.168.21.1

iface eth1 inet static
    address 193.1.132.124
    netmask 255.255.252.0
    gateway 193.1.132.1
</source>
* Install the non-free Linux firmware for the network interface (eth0). This will allow Gigabit operation on eth0 with the tg3 hardware (I think). Edit <code>/etc/apt/sources.list</code> to include the lines:
<source lang="text">deb http://ftp.ie.debian.org/debian/ squeeze main contrib non-free
deb-src http://ftp.ie.debian.org/debian/ squeeze main contrib non-free</source>
* Install firmware-linux:
<source lang="text">apt-get update && apt-get install firmware-linux</source>
You probably need to reboot now.

===DNS / BIND===
We will run our own DNS server for the cluster. First set <code>/etc/resolv.conf</code>:
<source lang="text">
nameserver 127.0.0.1
nameserver 137.43.116.19
nameserver 137.43.116.17
nameserver 137.43.105.22
domain ucd.ie
search ucd.ie
</source>
Now install bind9 (<code>apt-get install bind9</code>). Edit <code>/etc/bind/named.conf.local</code> and set the domain zones for the cluster (forward and reverse).
We have two subnets for which reverse lookups will have to be specified: 192.168.20 and 192.168.21.
<source lang="text">
//
// Do any local configuration here
//

// Consider adding the 1918 zones here, if they are not used in your
// organization
//include "/etc/bind/zones.rfc1918";

include "/etc/bind/rndc.key";

controls {
    inet 127.0.0.1 allow { localhost; } keys { "rndc-key"; };
};

zone "heterogeneous.ucd.ie" {
    type master;
    file "db.heterogeneous.ucd.ie";
};

zone "21.168.192.in-addr.arpa" {
    type master;
    file "db.192.168.21";
};

zone "20.168.192.in-addr.arpa" {
    type master;
    file "db.192.168.20";
};
</source>
Also edit the options file, <code>/etc/bind/named.conf.options</code>. Note the subnet we define in the allow sections, 192.168.20.0/23: it permits access from 192.168.20.* and 192.168.21.* addresses.
<source lang="text">
options {
    directory "/var/cache/bind";

    // If there is a firewall between you and nameservers you want
    // to talk to, you may need to fix the firewall to allow multiple
    // ports to talk. See http://www.kb.cert.org/vuls/id/800113

    // If your ISP provided one or more IP addresses for stable
    // nameservers, you probably want to use them as forwarders.
    forwarders {
        137.43.116.19;
        137.43.116.17;
        137.43.105.22;
    };

    recursion yes;
    version "REFUSED";
    allow-recursion { 127.0.0.1; 192.168.20.0/23; };
    allow-query { 127.0.0.1; 192.168.20.0/23; };

    auth-nxdomain no; # conform to RFC1035
    listen-on-v6 { any; };
};
</source>
Now work on the zone files specified above, <code>/var/cache/bind/db.heterogeneous.ucd.ie</code>, and the reverse maps <code>/var/cache/bind/db.192.168.20</code> & <code>/var/cache/bind/db.192.168.21</code>. Populate them with all nodes of the cluster.

===IP Tables===
* Set up <code>iptables</code>. We want to implement NAT between the internal network (<code>eth0</code>) and the external one (<code>eth1</code>).
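The NAT script below relies on kernel IPv4 forwarding, which it enables in its last line. The current state can be inspected by hand first (a small sketch; enabling requires root and does not persist across reboots):

```shell
# Inspect whether IPv4 forwarding is currently enabled (prints 0 or 1)
cat /proc/sys/net/ipv4/ip_forward
# To enable it for the running kernel (as root), the script's final step is:
#   echo 1 > /proc/sys/net/ipv4/ip_forward
```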
Add a script named <code>00iptables</code> to the <code>/etc/network/if-up.d</code> directory. All scripts in this directory are executed after network interfaces are brought up, so the rules will persist across reboots:
<source lang="bash">
#!/bin/sh
PATH=/usr/sbin:/sbin:/bin:/usr/bin

IF_INT=eth0
IF_EXT=eth1

# delete all existing rules.
iptables -F
iptables -t nat -F
iptables -t mangle -F
iptables -X

# Always accept loopback traffic
iptables -A INPUT -i lo -j ACCEPT

# Allow established connections, and those not coming from the outside
iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
iptables -A INPUT -m state --state NEW ! -i $IF_EXT -j ACCEPT
iptables -A FORWARD -i $IF_EXT -o $IF_INT -m state --state ESTABLISHED,RELATED -j ACCEPT

# Allow outgoing connections from the LAN side.
iptables -A FORWARD -i $IF_INT -o $IF_EXT -j ACCEPT

# Masquerade.
iptables -t nat -A POSTROUTING -o $IF_EXT -j MASQUERADE

# Don't forward from the outside to the inside.
iptables -A FORWARD -i $IF_EXT -o $IF_INT -j REJECT

# Enable routing.
echo 1 > /proc/sys/net/ipv4/ip_forward
</source>

==NFS==
Install:
 apt-get install nfs-kernel-server nfs-common portmap
Add to <code>/etc/exports</code>:
 /home 192.168.20.0/255.255.254.0(rw,sync,no_root_squash,no_subtree_check)
Apply the export with:
 exportfs -a

==Clonezilla==
Firstly, Clonezilla is probably going to pollute a lot of your server configuration when it sets itself up. Be prepared to lose your iptables configuration, NFS settings (if any) and DHCP settings. Maybe more.
* Follow the guide to installing Clonezilla [http://www.howtoforge.com/cloning-linux-systems-with-clonezilla-server-edition-clonezilla-se-p2 here].
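With <code>/home</code> exported as in the NFS section above, each compute node mounts it at boot. A sketch of the node-side <code>/etc/fstab</code> entry (the hostname <code>heterogeneous</code> resolving to the head node, and the mount options, are assumptions):

```text
# mount the shared home directories from the head node
heterogeneous:/home  /home  nfs  rw,hard,intr  0  0
```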
Essentially: ** add repository key <source lang="text">wget -q http://drbl.sourceforge.net/GPG-KEY-DRBL -O- | apt-key add -</source> ** the line add to /etc/apt/sources.list: <source lang="text">deb http://drbl.sourceforge.net/drbl-core drbl stable</source> ** run: <source lang="text">apt-get update && apt-get install drbl && /opt/drbl/sbin/drbl4imp</source> ** accept default options to drbl4imp. * After Clonezilla has installed edit <code>/etc/dhcpd3/dhcpd.conf</code>, adding all entries for test nodes <code>hcl07</code> and <code>hcl03</code>. Also ensure these nodes have been removed from the inplace heterogeneous.ucd.ie server so that they are only served by one machine. <source lang="text"> default-lease-time 300; max-lease-time 300; option subnet-mask 255.255.255.0; option domain-name-servers 137.43.116.19,137.43.116.17,137.43.105.22; option domain-name "ucd.ie"; ddns-update-style none; # brett had ad-hoc ...? server-name drbl; filename = "pxelinux.0"; subnet 192.168.21.0 netmask 255.255.255.0 { option subnet-mask 255.255.255.0; option routers 192.168.21.1; next-server 192.168.21.254; pool { # allow members of "DRBL-Client"; range 192.168.21.200 192.168.21.212; } host hcl03 { option host-name "hcl03.ucd.ie"; hardware ethernet 00:14:22:0A:22:6C; fixed-address 192.168.21.5; } host hcl03_eth1 { option host-name "hcl03_eth1.ucd.ie"; hardware ethernet 00:14:22:0A:22:6D; fixed-address 192.168.21.105; } host hcl07 { option host-name "hcl07.ucd.ie"; hardware ethernet 00:14:22:0A:20:E2; fixed-address 192.168.21.9; } host hcl07_eth1 { option host-name "hcl07_eth1.ucd.ie"; hardware ethernet 00:14:22:0A:20:E3; fixed-address 192.168.21.109; } default-lease-time 21600; max-lease-time 43200; } </source> ==NTP Daemon== Nodes in the cluster should use the Network Time Protocol to set their clocks, heterogeneous.ucd.ie should provide them that service. 
Install ntpd with the following command: apt-get install ntp Configure the daemon with the following line in <code>/etc/ntp.conf</code>: restrict 192.168.20.0 mask 255.255.254.0 nomodify notrap ==Install DHCP== Install the DHCP server package with <code>apt-get install dhcp3-server</code>. When you install Clonezilla it will probably pollute your DHCP server setup, so make ... TODO ==Install NIS== Copy users from <code>passwd</code>, <code>groups</code> and <code>shadow</code> from <code>/etc</code> on <code>hcl01</code>. Install nis. Set domain to heterogeneous.ucd.ie Edit <code>/etc/defaultdomain</code> so that it contains: heterogeneous.ucd.ie Edit <code>/etc/default/nis</code> so that it contains: # Are we a NIS server and if so what kind (values: false, slave, master) NISSERVER=master Edit <code>/etc/ypserv.securenets</code> so that is contains: # allow connects from local 255.0.0.0 127.0.0.0 # allow connections from heterogeneous subnets .20 and .21 255.255.254.0 192.168.20.0 The NIS host is also a client of itself, so do the client set up as follows: Edit <code>/etc/hosts</code> end ensure the NIS Master is listed 192.168.20.254 heterogeneous.ucd.ie heterogeneous Edit <code>/etc/yp.conf</code> and ensure that it contains: domain heterogeneous.ucd.ie server localhost Edit <code>/etc/passwd</code> adding a line to the end that reads: <code>+::::::</code>. Edit <code>/etc/group</code> with a line <code>+:::</code> at the line. The NIS Makefile will not pull userid and groupids that are lower than a certain value, we must set this to 500 in <code>/var/yp/Makefile</code> MINUID=500 MINGID=500 Start the <code>ypbind</code> and <code>yppasswd</code> services. Then initialise the NIS database: /usr/lib/yp/ypinit -m Accept defaults at prompts. Now start other NIS services service nis start ==Installing Ganglia Frontend== Install the packages gmetad and ganglia-webfrontend. 
Configure the front end by appending to <code>/etc/apache2/apache2.conf</code>, the following: Include /etc/ganglia-webfrontend/apache.conf Configure gmetad by adding to the <code>/etc/ganglia/gmetad.conf</code>, the following line: data_source "HCL Cluster" 192.168.20.1 192.168.20.16 data_source "HCL Service" localhost:8648 This means that the gmetad collector connect to hcl01 and hcl16 on the .20 subnet to gather data for the frontend to use. After all packages are configured execute: <source lang="text"> service apache2 restart service gmetad restart </source> Pointing your browser to [http://heterogeneous.ucd.ie/ganglia/index.php here] should display the monitoring page for HCL Cluster. <code>gmond</code> must also be installed and configured on the cluster nodes. =Hardware Monitoring & Backup= ==Disk Monitoring== Install smartmontools as per [http://www.howtoforge.com/checking-hard-disk-sanity-with-smartmontools-debian-ubuntu here]. Briefly: apt-get install smartmontools Edit <code>/etc/defaults/smartmontools</code> so that it contains: <source lang="text"> # List of devices you want to explicitly enable S.M.A.R.T. for # Not needed (and not recommended) if the device is monitored by smartd enable_smart="/dev/sda" # uncomment to start smartd on system startup start_smartd=yes # uncomment to pass additional options to smartd on startup smartd_opts="--interval=1800" </source> Open <code>/etc/smartd.conf</code> and edit the first line that begins with DEVICESCAN (all lines after the first instance of DEVICESCAN are ignored). Have it read something like: DEVICESCAN -d removable -n standby -m root -m robert_higgins@iol.ie -M exec /usr/share/smartmontools/smartd-runner Then start the service <code>/etc/init.d/smartmontools start</code> Note, consider installing this on all nodes, as it would be interesting to have prior notice of any failing disks. = Torque - PBS = Download Torque from [http://www.clusterresources.com/pages/products/torque-resource-manager.php here]. 
Extract archive and configure: ./configure --prefix=/usr/local Run <code>make && make install</code>. Torque will have installed files under <code>/usr/local</code> and <code>/var/spool/torque</code>. Edit the Torque config file <code>/var/spool/torque/torque.cfg</code> adding the line: SERVERHOST=192.168.20.254 Add the compute nodes of the cluster to the file <code>/var/spool/torque/server_priv/nodes</code> <source lang="text"> hcl01 hcl02 hcl03 hcl04 hcl05 hcl06 hcl07 hcl08 hcl09 hcl10 hcl11 hcl12 hcl13 hcl14 hcl15 hcl16 </source> By default /usr/local/lib may not be in the list of directories to search for dynamic-linked libraries. If so, add that path to the end of /etc/ld.so.conf.d/local.conf, and run "ldconfig" to update. Now return to the extracted source distribution, find the <code>contrib/init.d</code> folder and copy the <code>debian.pbs_server</code> to <code>/etc/init.d/pbs_server</code>. Edit the inplace file so that the header reads: <source lang="bash"> #!/bin/sh ### BEGIN INIT INFO # Provides: pbs_server # Required-Start: $local_fs $named # Should-Start: # Required-Stop: </source> This will ensure that important services are started before the PBS/Torque daemon. then run update-rc.d pbs_mom defaults ===Queues=== We will configure 4 queues, with varying priority and preemption policies. *The first is the <b>normal</b> queue. This will be where most jobs, which should not be interrupted, will be placed. It has the highest priority. *The second queue is <b>lowpri</b>, it is for running jobs which may run for extended periods but are interruptible. Should a job <i>A</i> be placed on the normal queue while a lowpri job <i>B</i> is running, <i>B</i> will be sent a kill signal, so it may shut down cleanly, and then it will be requeued on the system, so it can resume running after <i>A</i>b has finished. *The third queue is for running of service jobs like the homedir backup. It is named <b>service</b>. 
This will have lower priority than the above queues and jobs running on this queue will be preemptable. *The final queue is the <b>volunteer</b> queue. This has the lowest priority and is for executing volunteer computing jobs during otherwise idle periods. These jobs will also be preemptable. ====Queue Setup==== Enter the queue manager program <code>qmgr</code> as the root user on heterogeneous. Send it the following commands: Allow all users to see all queued jobs: set server query_other_jobs=TRUE' Create the default <b>normal</b> queue create queue normal set queue normal queue_type = Execution set queue normal Priority = 10000 set queue normal enabled = True set queue normal started = True Create <b>lowpri</b> queue create queue lowpri set queue lowpri queue_type = Execution set queue lowpri Priority = 1000 set queue lowpri enabled = True set queue lowpri started = True Create <b>service</b> queue set queue service queue_type = Execution set queue service Priority = 100 set queue service enabled = True set queue service started = True Create <b>volunteer</b> queue set queue volunteer queue_type = Execution set queue volunteer Priority = 0 set queue volunteer enabled = True set queue volunteer started = True Set some server settings set server default_queue=normal set server resources_default.nodes = 1 set server resources_default.walltime = 12:00:00 Now edit the maui configuration for these queues: <source lang="text"> SERVERHOST heterogeneous ADMIN1 root RMPOLLINTERVAL 00:00:01 DEFERTIME 0 DEFERCOUNT 86400 PREEMPTIONPOLICY REQUEUE QUEUETIMEWEIGHT 1 QOSWEIGHT 1 SYSCFG QLIST=normal,lowpri,service,volunteer QOSCFG[normal] QFLAGS=PREEMPTOR QOSCFG[lowpri] QFLAGS=PREEMPTEE,PREEMPTOR QOSCFG[service] QFLAGS=PREEMPTEE,PREEMPTOR QOSCFG[volunteer] QFLAGS=PREEMPTEE CREDWEIGHT 1 CLASSWEIGHT 1 CLASSCFG[normal] QDEF=normal PRIORITY=10000 CLASSCFG[lowpri] QDEF=lowpri PRIORITY=1000 CLASSCFG[service] QDEF=service PRIORITY=100 CLASSCFG[volunteer] QDEF=volunteer PRIORITY=0 
</source> 36dd5e870ec544c351ff52081c1d70fb0f7d049b 504 503 2010-10-12T17:46:50Z Davepc 2 /* Torque - PBS */ wikitext text/x-wiki * Basic installation of Debian Squeeze ==Networking== ===Interfaces=== * edit <code>/etc/networks/interfaces</code> Note that at some point eth1 should be configured by DHCP, it is on the UCD LAN and must be registered correctly (update MAC address with services). <code>eth0</code> is the internal network. <source lang="text"> # The loopback network interface auto lo eth0 eth1 iface lo inet loopback # The primary network interface allow-hotplug eth0 iface eth0 inet static address 192.168.21.254 netmask 255.255.255.0 gateway 192.168.21.1 iface eth1 inet static address 193.1.132.124 netmask 255.255.252.0 gateway 193.1.132.1 </source> * Install non-free linux firmware for network interface (eth0). This will allow Gigabit operation on eth0 with the tg3 hardware (I think). Edit <code>/etc/apt/sources.list</code> including the lines: <source lang="text">deb http://ftp.ie.debian.org/debian/ squeeze main contrib non-free deb-src http://ftp.ie.debian.org/debian/ squeeze main contrib non-free</source> * Install firmware-linux: <source lang="text">apt-get update && apt-get install firmware-linux</source> You probably need to reboot now. ===DNS / BIND=== We will run our own DNS server for the cluster. First set resolv.conf: <source lang="text"> nameserver 127.0.0.1 nameserver 137.43.116.19 nameserver 137.43.116.17 nameserver 137.43.105.22 domain ucd.ie search ucd.ie </source> Now install bind9 (<code>apt-get install bind9</code>). Edit <code>/etc/bind/named.conf.local</code> and set the domain zones for the cluster (forwards and reverse). 
We have two subdomains where reverse lookups will have to be specified 192.168.20 and 192.168.21 <source lang="text"> // // Do any local configuration here // // Consider adding the 1918 zones here, if they are not used in your // organization //include "/etc/bind/zones.rfc1918"; include "/etc/bind/rndc.key"; controls { inet 127.0.0.1 allow { localhost; } keys { "rndc-key"; }; }; zone "heterogeneous.ucd.ie" { type master; file "db.heterogeneous.ucd.ie"; }; zone "21.168.192.in-addr.arpa" { type master; file "db.192.168.21"; }; zone "20.168.192.in-addr.arpa" { type master; file "db.192.168.20"; }; </source> Also edit the options file: <code>/etc/bind/named.conf.options</code>, note the subnet we define in the allow sections, 192.168.20/23, it will permit access from 192.168.20.* and 192.168.21.* addresses. <source lang="text"> options { directory "/var/cache/bind"; // If there is a firewall between you and nameservers you want // to talk to, you may need to fix the firewall to allow multiple // ports to talk. See http://www.kb.cert.org/vuls/id/800113 // If your ISP provided one or more IP addresses for stable // nameservers, you probably want to use them as forwarders. // Uncomment the following block, and insert the addresses replacing // the all-0's placeholder. forwarders { 137.43.116.19; 137.43.116.17; 137.43.105.22; }; recursion yes; version "REFUSED"; allow-recursion { 127.0.0.1; 192.168.20.0/23; }; allow-query { 127.0.0.1; 192.168.20.0/23; }; auth-nxdomain no; # conform to RFC1035 listen-on-v6 { any; }; }; </source> Now work on the zone files specified <code>/var/cache/bind/db.heterogneneous.ucd.ie</code> and the reverse maps <code>/var/cache/bind/db.192.168.21</code> & <code>/var/cache/bind/db.192.168.21</code> Populate them with all nodes of the cluster. ===IP Tables=== * Set up <code>iptables</code>. We want to implement NAT between the internal network (<code>eth0</code>) and external one (<code>eth1</code>). 
Add a script to <code>/etc/network/if-up.d</code> directory, named <code>00iptables</code>. All scripts in this directory will be executed after network interfaces are brought up, so this will persist: <source lang="bash"> #!/bin/sh PATH=/usr/sbin:/sbin:/bin:/usr/bin IF_INT=eth0 IF_EXT=eth1 # # delete all existing rules. # iptables -F iptables -t nat -F iptables -t mangle -F iptables -X # Always accept loopback traffic iptables -A INPUT -i lo -j ACCEPT # Allow established connections, and those not coming from the outside iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT iptables -A INPUT -m state --state NEW ! -i $IF_EXT -j ACCEPT iptables -A FORWARD -i $IF_EXT -o $IF_INT -m state --state ESTABLISHED,RELATED -j ACCEPT # Allow outgoing connections from the LAN side. iptables -A FORWARD -i $IF_INT -o $IF_EXT -j ACCEPT # Masquerade. iptables -t nat -A POSTROUTING -o $IF_EXT -j MASQUERADE # Don't forward from the outside to the inside. iptables -A FORWARD -i $IF_EXT -o $IF_EXT -j REJECT # Enable routing. echo 1 > /proc/sys/net/ipv4/ip_forward </source> ==NFS== Install: apt-get install nfs-kernel-server nfs-common portmap Add to /etc/exports /home 192.168.20.0/255.255.254.0(rw,sync,no_root_squash,no_subtree_check) Restart nfs server with exportfs -a ==Clonezilla== Firstly, Clonezilla is probably going to pollute a lot of your server configuration when it sets itself up. Be prepared to loose your IPtables configuration, NFS (if any) and DHCP settings. Maybe more. * follow the guide to installing Clonezilla [http://www.howtoforge.com/cloning-linux-systems-with-clonezilla-server-edition-clonezilla-se-p2 here]. 
Essentially: ** add repository key <source lang="text">wget -q http://drbl.sourceforge.net/GPG-KEY-DRBL -O- | apt-key add -</source> ** the line add to /etc/apt/sources.list: <source lang="text">deb http://drbl.sourceforge.net/drbl-core drbl stable</source> ** run: <source lang="text">apt-get update && apt-get install drbl && /opt/drbl/sbin/drbl4imp</source> ** accept default options to drbl4imp. * After Clonezilla has installed edit <code>/etc/dhcpd3/dhcpd.conf</code>, adding all entries for test nodes <code>hcl07</code> and <code>hcl03</code>. Also ensure these nodes have been removed from the inplace heterogeneous.ucd.ie server so that they are only served by one machine. <source lang="text"> default-lease-time 300; max-lease-time 300; option subnet-mask 255.255.255.0; option domain-name-servers 137.43.116.19,137.43.116.17,137.43.105.22; option domain-name "ucd.ie"; ddns-update-style none; # brett had ad-hoc ...? server-name drbl; filename = "pxelinux.0"; subnet 192.168.21.0 netmask 255.255.255.0 { option subnet-mask 255.255.255.0; option routers 192.168.21.1; next-server 192.168.21.254; pool { # allow members of "DRBL-Client"; range 192.168.21.200 192.168.21.212; } host hcl03 { option host-name "hcl03.ucd.ie"; hardware ethernet 00:14:22:0A:22:6C; fixed-address 192.168.21.5; } host hcl03_eth1 { option host-name "hcl03_eth1.ucd.ie"; hardware ethernet 00:14:22:0A:22:6D; fixed-address 192.168.21.105; } host hcl07 { option host-name "hcl07.ucd.ie"; hardware ethernet 00:14:22:0A:20:E2; fixed-address 192.168.21.9; } host hcl07_eth1 { option host-name "hcl07_eth1.ucd.ie"; hardware ethernet 00:14:22:0A:20:E3; fixed-address 192.168.21.109; } default-lease-time 21600; max-lease-time 43200; } </source> ==NTP Daemon== Nodes in the cluster should use the Network Time Protocol to set their clocks, heterogeneous.ucd.ie should provide them that service. 
Install ntpd with the following command: apt-get install ntp Configure the daemon with the following line in <code>/etc/ntp.conf</code>: restrict 192.168.20.0 mask 255.255.254.0 nomodify notrap ==Install DHCP== Install the DHCP server package with <code>apt-get install dhcp3-server</code>. When you install Clonezilla it will probably pollute your DHCP server setup, so make ... TODO ==Install NIS== Copy users from <code>passwd</code>, <code>groups</code> and <code>shadow</code> from <code>/etc</code> on <code>hcl01</code>. Install nis. Set domain to heterogeneous.ucd.ie Edit <code>/etc/defaultdomain</code> so that it contains: heterogeneous.ucd.ie Edit <code>/etc/default/nis</code> so that it contains: # Are we a NIS server and if so what kind (values: false, slave, master) NISSERVER=master Edit <code>/etc/ypserv.securenets</code> so that is contains: # allow connects from local 255.0.0.0 127.0.0.0 # allow connections from heterogeneous subnets .20 and .21 255.255.254.0 192.168.20.0 The NIS host is also a client of itself, so do the client set up as follows: Edit <code>/etc/hosts</code> end ensure the NIS Master is listed 192.168.20.254 heterogeneous.ucd.ie heterogeneous Edit <code>/etc/yp.conf</code> and ensure that it contains: domain heterogeneous.ucd.ie server localhost Edit <code>/etc/passwd</code> adding a line to the end that reads: <code>+::::::</code>. Edit <code>/etc/group</code> with a line <code>+:::</code> at the line. The NIS Makefile will not pull userid and groupids that are lower than a certain value, we must set this to 500 in <code>/var/yp/Makefile</code> MINUID=500 MINGID=500 Start the <code>ypbind</code> and <code>yppasswd</code> services. Then initialise the NIS database: /usr/lib/yp/ypinit -m Accept defaults at prompts. Now start other NIS services service nis start ==Installing Ganglia Frontend== Install the packages gmetad and ganglia-webfrontend. 
Configure the front end by appending to <code>/etc/apache2/apache2.conf</code>, the following: Include /etc/ganglia-webfrontend/apache.conf Configure gmetad by adding to the <code>/etc/ganglia/gmetad.conf</code>, the following line: data_source "HCL Cluster" 192.168.20.1 192.168.20.16 data_source "HCL Service" localhost:8648 This means that the gmetad collector connect to hcl01 and hcl16 on the .20 subnet to gather data for the frontend to use. After all packages are configured execute: <source lang="text"> service apache2 restart service gmetad restart </source> Pointing your browser to [http://heterogeneous.ucd.ie/ganglia/index.php here] should display the monitoring page for HCL Cluster. <code>gmond</code> must also be installed and configured on the cluster nodes. =Hardware Monitoring & Backup= ==Disk Monitoring== Install smartmontools as per [http://www.howtoforge.com/checking-hard-disk-sanity-with-smartmontools-debian-ubuntu here]. Briefly: apt-get install smartmontools Edit <code>/etc/defaults/smartmontools</code> so that it contains: <source lang="text"> # List of devices you want to explicitly enable S.M.A.R.T. for # Not needed (and not recommended) if the device is monitored by smartd enable_smart="/dev/sda" # uncomment to start smartd on system startup start_smartd=yes # uncomment to pass additional options to smartd on startup smartd_opts="--interval=1800" </source> Open <code>/etc/smartd.conf</code> and edit the first line that begins with DEVICESCAN (all lines after the first instance of DEVICESCAN are ignored). Have it read something like: DEVICESCAN -d removable -n standby -m root -m robert_higgins@iol.ie -M exec /usr/share/smartmontools/smartd-runner Then start the service <code>/etc/init.d/smartmontools start</code> Note, consider installing this on all nodes, as it would be interesting to have prior notice of any failing disks. = Torque - PBS = Download Torque from [http://www.clusterresources.com/pages/products/torque-resource-manager.php here]. 
Extract archive and configure: ./configure --prefix=/usr/local Run <code>make && make install</code>. Torque will have installed files under <code>/usr/local</code> and <code>/var/spool/torque</code>. Edit the Torque config file <code>/var/spool/torque/torque.cfg</code> adding the line: SERVERHOST=192.168.20.254 Add the compute nodes of the cluster to the file <code>/var/spool/torque/server_priv/nodes</code> <source lang="text"> hcl01 hcl02 hcl03 hcl04 hcl05 hcl06 hcl07 hcl08 hcl09 hcl10 hcl11 hcl12 hcl13 hcl14 hcl15 hcl16 </source> By default /usr/local/lib may not be in the list of directories to search for dynamic-linked libraries. If so, add that path to the end of /etc/ld.so.conf.d/local.conf, and run "ldconfig" to update. Now return to the extracted source distribution, find the <code>contrib/init.d</code> folder and copy the <code>debian.pbs_server</code> to <code>/etc/init.d/pbs_server</code>. Edit the inplace file so that the header reads: <source lang="bash"> #!/bin/sh ### BEGIN INIT INFO # Provides: pbs_server # Required-Start: $local_fs $named # Should-Start: # Required-Stop: </source> This will ensure that important services are started before the PBS/Torque daemon. then run update-rc.d pbs_server defaults ===Queues=== We will configure 4 queues, with varying priority and preemption policies. *The first is the <b>normal</b> queue. This will be where most jobs, which should not be interrupted, will be placed. It has the highest priority. *The second queue is <b>lowpri</b>, it is for running jobs which may run for extended periods but are interruptible. Should a job <i>A</i> be placed on the normal queue while a lowpri job <i>B</i> is running, <i>B</i> will be sent a kill signal, so it may shut down cleanly, and then it will be requeued on the system, so it can resume running after <i>A</i>b has finished. *The third queue is for running of service jobs like the homedir backup. It is named <b>service</b>. 
This will have lower priority than the above queues, and jobs running on it will be preemptable.
*The final queue is the <b>volunteer</b> queue. This has the lowest priority and is for executing volunteer computing jobs during otherwise idle periods. These jobs will also be preemptable.

====Queue Setup====

Enter the queue manager program <code>qmgr</code> as the root user on heterogeneous. Send it the following commands.

Allow all users to see all queued jobs:

 set server query_other_jobs = TRUE

Create the default <b>normal</b> queue:

 create queue normal
 set queue normal queue_type = Execution
 set queue normal Priority = 10000
 set queue normal enabled = True
 set queue normal started = True

Create the <b>lowpri</b> queue:

 create queue lowpri
 set queue lowpri queue_type = Execution
 set queue lowpri Priority = 1000
 set queue lowpri enabled = True
 set queue lowpri started = True

Create the <b>service</b> queue:

 create queue service
 set queue service queue_type = Execution
 set queue service Priority = 100
 set queue service enabled = True
 set queue service started = True

Create the <b>volunteer</b> queue:

 create queue volunteer
 set queue volunteer queue_type = Execution
 set queue volunteer Priority = 0
 set queue volunteer enabled = True
 set queue volunteer started = True

Set some server defaults:

 set server default_queue = normal
 set server resources_default.nodes = 1
 set server resources_default.walltime = 12:00:00

Now edit the maui configuration for these queues:

<source lang="text">
SERVERHOST heterogeneous
ADMIN1 root
RMPOLLINTERVAL 00:00:01
DEFERTIME 0
DEFERCOUNT 86400
PREEMPTIONPOLICY REQUEUE
QUEUETIMEWEIGHT 1
QOSWEIGHT 1
SYSCFG QLIST=normal,lowpri,service,volunteer
QOSCFG[normal] QFLAGS=PREEMPTOR
QOSCFG[lowpri] QFLAGS=PREEMPTEE,PREEMPTOR
QOSCFG[service] QFLAGS=PREEMPTEE,PREEMPTOR
QOSCFG[volunteer] QFLAGS=PREEMPTEE
CREDWEIGHT 1
CLASSWEIGHT 1
CLASSCFG[normal] QDEF=normal PRIORITY=10000
CLASSCFG[lowpri] QDEF=lowpri PRIORITY=1000
CLASSCFG[service] QDEF=service PRIORITY=100
CLASSCFG[volunteer] QDEF=volunteer PRIORITY=0
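# Annotation (ours, not part of the stock maui configuration): the QOS
# flags above encode the preemption scheme. PREEMPTOR means jobs in that
# QOS may preempt others; PREEMPTEE means they may themselves be
# preempted. So normal jobs only ever preempt, volunteer jobs are only
# ever preempted, and lowpri/service jobs can do both: they yield to
# normal but can displace volunteer jobs.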
</source>

* Basic installation of Debian Squeeze

==Networking==

===Interfaces===

* edit <code>/etc/network/interfaces</code>

Note that at some point eth1 should be configured by DHCP; it is on the UCD LAN and must be registered correctly (update the MAC address with Services). <code>eth0</code> is the internal network.

<source lang="text">
# The loopback network interface
auto lo eth0 eth1
iface lo inet loopback

# The primary network interface
allow-hotplug eth0
iface eth0 inet static
    address 192.168.21.254
    netmask 255.255.255.0
    gateway 192.168.21.1

iface eth1 inet static
    address 193.1.132.124
    netmask 255.255.252.0
    gateway 193.1.132.1
</source>

* Install non-free Linux firmware for the network interface (eth0). This should allow Gigabit operation on eth0 with the tg3 hardware (I think). Edit <code>/etc/apt/sources.list</code>, including the lines:

<source lang="text">
deb http://ftp.ie.debian.org/debian/ squeeze main contrib non-free
deb-src http://ftp.ie.debian.org/debian/ squeeze main contrib non-free
</source>

* Install firmware-linux:

<source lang="text">apt-get update && apt-get install firmware-linux</source>

You probably need to reboot now.

===DNS / BIND===

We will run our own DNS server for the cluster. First set <code>/etc/resolv.conf</code>:

<source lang="text">
nameserver 127.0.0.1
nameserver 137.43.116.19
nameserver 137.43.116.17
nameserver 137.43.105.22
domain ucd.ie
search ucd.ie
</source>

Now install bind9 (<code>apt-get install bind9</code>). Edit <code>/etc/bind/named.conf.local</code> and set the domain zones for the cluster (forward and reverse).
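For reference, a minimal forward zone file of the kind <code>named.conf.local</code> points at might look as follows. This is a sketch: the serial number and the per-node addresses shown are illustrative, not taken from the cluster's real records.

<source lang="text">
; /var/cache/bind/db.heterogeneous.ucd.ie (sketch; addresses illustrative)
$TTL 604800
@       IN      SOA     heterogeneous.ucd.ie. root.heterogeneous.ucd.ie. (
                        2010101201      ; serial (illustrative)
                        604800          ; refresh
                        86400           ; retry
                        2419200         ; expire
                        604800 )        ; negative cache TTL
@       IN      NS      heterogeneous.ucd.ie.
@       IN      A       192.168.21.254
hcl01   IN      A       192.168.21.3    ; illustrative address
hcl02   IN      A       192.168.21.4    ; illustrative address
; ... one A record per node ...
</source>

The reverse map files follow the same pattern with PTR records instead of A records.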
We have two subnets for which reverse lookups must be specified: 192.168.20 and 192.168.21.

<source lang="text">
//
// Do any local configuration here
//

// Consider adding the 1918 zones here, if they are not used in your
// organization
//include "/etc/bind/zones.rfc1918";

include "/etc/bind/rndc.key";

controls {
    inet 127.0.0.1 allow { localhost; } keys { "rndc-key"; };
};

zone "heterogeneous.ucd.ie" {
    type master;
    file "db.heterogeneous.ucd.ie";
};

zone "21.168.192.in-addr.arpa" {
    type master;
    file "db.192.168.21";
};

zone "20.168.192.in-addr.arpa" {
    type master;
    file "db.192.168.20";
};
</source>

Also edit the options file <code>/etc/bind/named.conf.options</code>. Note the subnet we define in the allow sections, 192.168.20.0/23; it permits access from 192.168.20.* and 192.168.21.* addresses.

<source lang="text">
options {
    directory "/var/cache/bind";

    // If there is a firewall between you and nameservers you want
    // to talk to, you may need to fix the firewall to allow multiple
    // ports to talk.  See http://www.kb.cert.org/vuls/id/800113

    // If your ISP provided one or more IP addresses for stable
    // nameservers, you probably want to use them as forwarders.
    forwarders {
        137.43.116.19;
        137.43.116.17;
        137.43.105.22;
    };

    recursion yes;
    version "REFUSED";
    allow-recursion { 127.0.0.1; 192.168.20.0/23; };
    allow-query { 127.0.0.1; 192.168.20.0/23; };

    auth-nxdomain no;    # conform to RFC1035
    listen-on-v6 { any; };
};
</source>

Now create the zone files specified above, <code>/var/cache/bind/db.heterogeneous.ucd.ie</code>, and the reverse maps <code>/var/cache/bind/db.192.168.21</code> & <code>/var/cache/bind/db.192.168.20</code>. Populate them with all nodes of the cluster.

===IP Tables===

* Set up <code>iptables</code>. We want to implement NAT between the internal network (<code>eth0</code>) and the external one (<code>eth1</code>).
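The 192.168.20.0/23 used in the allow lists above covers both internal subnets because a /23 netmask (255.255.254.0) maps .20.* and .21.* addresses to the same network address. A quick shell sketch of the mask arithmetic (the <code>mask_and</code> helper is ours, not part of any tool):

<source lang="bash">
#!/bin/sh
# AND a dotted-quad address with a dotted-quad netmask, octet by octet,
# to recover the network address it belongs to.
mask_and () {
    IFS=. read -r a1 a2 a3 a4 <<EOF
$1
EOF
    IFS=. read -r m1 m2 m3 m4 <<EOF
$2
EOF
    echo "$((a1 & m1)).$((a2 & m2)).$((a3 & m3)).$((a4 & m4))"
}

mask_and 192.168.20.5  255.255.254.0   # -> 192.168.20.0
mask_and 192.168.21.77 255.255.254.0   # -> 192.168.20.0 (same /23 network)
</source>

Both addresses reduce to 192.168.20.0, which is why a single /23 entry serves both subnets here (and later in the NIS securenets file).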
Add a script named <code>00iptables</code> to the <code>/etc/network/if-up.d</code> directory. All scripts in this directory are executed after network interfaces are brought up, so the rules will persist:

<source lang="bash">
#!/bin/sh
PATH=/usr/sbin:/sbin:/bin:/usr/bin

IF_INT=eth0
IF_EXT=eth1

#
# delete all existing rules.
#
iptables -F
iptables -t nat -F
iptables -t mangle -F
iptables -X

# Always accept loopback traffic
iptables -A INPUT -i lo -j ACCEPT

# Allow established connections, and those not coming from the outside
iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
iptables -A INPUT -m state --state NEW ! -i $IF_EXT -j ACCEPT
iptables -A FORWARD -i $IF_EXT -o $IF_INT -m state --state ESTABLISHED,RELATED -j ACCEPT

# Allow outgoing connections from the LAN side.
iptables -A FORWARD -i $IF_INT -o $IF_EXT -j ACCEPT

# Masquerade.
iptables -t nat -A POSTROUTING -o $IF_EXT -j MASQUERADE

# Don't forward from the outside to the inside.
iptables -A FORWARD -i $IF_EXT -o $IF_EXT -j REJECT

# Enable routing.
echo 1 > /proc/sys/net/ipv4/ip_forward
</source>

==NFS==

Install:

 apt-get install nfs-kernel-server nfs-common portmap

Add to <code>/etc/exports</code>:

 /home 192.168.20.0/255.255.254.0(rw,sync,no_root_squash,no_subtree_check)

Re-export all shares with:

 exportfs -a

==Clonezilla==

Firstly, Clonezilla is probably going to pollute a lot of your server configuration when it sets itself up. Be prepared to lose your iptables configuration, NFS (if any) and DHCP settings. Maybe more.

* Follow the guide to installing Clonezilla [http://www.howtoforge.com/cloning-linux-systems-with-clonezilla-server-edition-clonezilla-se-p2 here].
Essentially:
** add the repository key: <source lang="text">wget -q http://drbl.sourceforge.net/GPG-KEY-DRBL -O- | apt-key add -</source>
** add the line to /etc/apt/sources.list: <source lang="text">deb http://drbl.sourceforge.net/drbl-core drbl stable</source>
** run: <source lang="text">apt-get update && apt-get install drbl && /opt/drbl/sbin/drbl4imp</source>
** accept the default options to drbl4imp.
* After Clonezilla has installed, edit <code>/etc/dhcp3/dhcpd.conf</code>, adding all entries for the test nodes <code>hcl07</code> and <code>hcl03</code>. Also ensure these nodes have been removed from the existing heterogeneous.ucd.ie DHCP configuration, so that they are served by only one machine.

<source lang="text">
default-lease-time 300;
max-lease-time 300;
option subnet-mask 255.255.255.0;
option domain-name-servers 137.43.116.19,137.43.116.17,137.43.105.22;
option domain-name "ucd.ie";
ddns-update-style none;
# brett had ad-hoc ...?
server-name drbl;
filename = "pxelinux.0";

subnet 192.168.21.0 netmask 255.255.255.0 {
    option subnet-mask 255.255.255.0;
    option routers 192.168.21.1;
    next-server 192.168.21.254;
    pool {
        # allow members of "DRBL-Client";
        range 192.168.21.200 192.168.21.212;
    }
    host hcl03 {
        option host-name "hcl03.ucd.ie";
        hardware ethernet 00:14:22:0A:22:6C;
        fixed-address 192.168.21.5;
    }
    host hcl03_eth1 {
        option host-name "hcl03_eth1.ucd.ie";
        hardware ethernet 00:14:22:0A:22:6D;
        fixed-address 192.168.21.105;
    }
    host hcl07 {
        option host-name "hcl07.ucd.ie";
        hardware ethernet 00:14:22:0A:20:E2;
        fixed-address 192.168.21.9;
    }
    host hcl07_eth1 {
        option host-name "hcl07_eth1.ucd.ie";
        hardware ethernet 00:14:22:0A:20:E3;
        fixed-address 192.168.21.109;
    }
    default-lease-time 21600;
    max-lease-time 43200;
}
</source>

==NTP Daemon==

Nodes in the cluster should use the Network Time Protocol to set their clocks; heterogeneous.ucd.ie should provide them that service.
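On the client side, each node then needs only a minimal <code>/etc/ntp.conf</code> pointing at the head node. A sketch (the driftfile path is Debian's default; the server address is the head node's internal address from the NIS setup):

<source lang="text">
# Use the head node as the sole time source
server 192.168.20.254 iburst
driftfile /var/lib/ntp/ntp.drift
</source>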
Install ntpd with the following command:

 apt-get install ntp

Configure the daemon with the following line in <code>/etc/ntp.conf</code>:

 restrict 192.168.20.0 mask 255.255.254.0 nomodify notrap

==Install DHCP==

Install the DHCP server package with <code>apt-get install dhcp3-server</code>. When you install Clonezilla it will probably pollute your DHCP server setup, so make ... TODO

==Install NIS==

Copy the users from <code>passwd</code>, <code>group</code> and <code>shadow</code> in <code>/etc</code> on <code>hcl01</code>. Install nis, setting the domain to heterogeneous.ucd.ie.

Edit <code>/etc/defaultdomain</code> so that it contains:

 heterogeneous.ucd.ie

Edit <code>/etc/default/nis</code> so that it contains:

 # Are we a NIS server and if so what kind (values: false, slave, master)
 NISSERVER=master

Edit <code>/etc/ypserv.securenets</code> so that it contains:

 # allow connections from local
 255.0.0.0       127.0.0.0
 # allow connections from heterogeneous subnets .20 and .21
 255.255.254.0   192.168.20.0

The NIS host is also a client of itself, so do the client setup as follows.

Edit <code>/etc/hosts</code> and ensure the NIS master is listed:

 192.168.20.254 heterogeneous.ucd.ie heterogeneous

Edit <code>/etc/yp.conf</code> and ensure that it contains:

 domain heterogeneous.ucd.ie server localhost

Edit <code>/etc/passwd</code>, adding a line to the end that reads <code>+::::::</code>. Edit <code>/etc/group</code>, adding a line <code>+:::</code> at the end.

The NIS Makefile will not pull user IDs and group IDs lower than a certain value; we must set this to 500 in <code>/var/yp/Makefile</code>:

 MINUID=500
 MINGID=500

Start the <code>ypbind</code> and <code>yppasswd</code> services. Then initialise the NIS database:

 /usr/lib/yp/ypinit -m

Accept the defaults at the prompts. Now start the other NIS services:

 service nis start

==Installing Ganglia Frontend==

Install the packages gmetad and ganglia-webfrontend.
Configure the front end by appending to <code>/etc/apache2/apache2.conf</code>, the following: Include /etc/ganglia-webfrontend/apache.conf Configure gmetad by adding to the <code>/etc/ganglia/gmetad.conf</code>, the following line: data_source "HCL Cluster" 192.168.20.1 192.168.20.16 data_source "HCL Service" localhost:8648 This means that the gmetad collector connect to hcl01 and hcl16 on the .20 subnet to gather data for the frontend to use. After all packages are configured execute: <source lang="text"> service apache2 restart service gmetad restart </source> Pointing your browser to [http://heterogeneous.ucd.ie/ganglia/index.php here] should display the monitoring page for HCL Cluster. <code>gmond</code> must also be installed and configured on the cluster nodes. =Hardware Monitoring & Backup= ==Disk Monitoring== Install smartmontools as per [http://www.howtoforge.com/checking-hard-disk-sanity-with-smartmontools-debian-ubuntu here]. Briefly: apt-get install smartmontools Edit <code>/etc/defaults/smartmontools</code> so that it contains: <source lang="text"> # List of devices you want to explicitly enable S.M.A.R.T. for # Not needed (and not recommended) if the device is monitored by smartd enable_smart="/dev/sda" # uncomment to start smartd on system startup start_smartd=yes # uncomment to pass additional options to smartd on startup smartd_opts="--interval=1800" </source> Open <code>/etc/smartd.conf</code> and edit the first line that begins with DEVICESCAN (all lines after the first instance of DEVICESCAN are ignored). Have it read something like: DEVICESCAN -d removable -n standby -m root -m robert_higgins@iol.ie -M exec /usr/share/smartmontools/smartd-runner Then start the service <code>/etc/init.d/smartmontools start</code> Note, consider installing this on all nodes, as it would be interesting to have prior notice of any failing disks. = Torque - PBS = Download Torque from [http://www.clusterresources.com/pages/products/torque-resource-manager.php here]. 
Extract archive and configure: ./configure --prefix=/usr/local Run <code>make && make install</code>. Torque will have installed files under <code>/usr/local</code> and <code>/var/spool/torque</code>. Edit the Torque config file <code>/var/spool/torque/torque.cfg</code> adding the line: SERVERHOST=192.168.20.254 Add the compute nodes of the cluster to the file <code>/var/spool/torque/server_priv/nodes</code> <source lang="text"> hcl01 hcl02 hcl03 hcl04 hcl05 hcl06 hcl07 hcl08 hcl09 hcl10 hcl11 hcl12 hcl13 hcl14 hcl15 hcl16 </source> By default /usr/local/lib may not be in the list of directories to search for dynamic-linked libraries. If so, add that path to the end of /etc/ld.so.conf.d/local.conf, and run "ldconfig" to update. Now return to the extracted source distribution, find the <code>contrib/init.d</code> folder and copy the <code>debian.pbs_server</code> to <code>/etc/init.d/pbs_server</code>. Edit the inplace file so that the header reads: <source lang="bash"> #!/bin/sh ### BEGIN INIT INFO # Provides: pbs_server # Required-Start: $local_fs $named # Should-Start: # Required-Stop: </source> This will ensure that important services are started before the PBS/Torque daemon. then run update-rc.d pbs_server defaults pbs_server -t create qterm pbs_server ===Queues=== We will configure 4 queues, with varying priority and preemption policies. *The first is the <b>normal</b> queue. This will be where most jobs, which should not be interrupted, will be placed. It has the highest priority. *The second queue is <b>lowpri</b>, it is for running jobs which may run for extended periods but are interruptible. Should a job <i>A</i> be placed on the normal queue while a lowpri job <i>B</i> is running, <i>B</i> will be sent a kill signal, so it may shut down cleanly, and then it will be requeued on the system, so it can resume running after <i>A</i>b has finished. *The third queue is for running of service jobs like the homedir backup. It is named <b>service</b>. 
This will have lower priority than the above queues and jobs running on this queue will be preemptable. *The final queue is the <b>volunteer</b> queue. This has the lowest priority and is for executing volunteer computing jobs during otherwise idle periods. These jobs will also be preemptable. ====Queue Setup==== Enter the queue manager program <code>qmgr</code> as the root user on heterogeneous. Send it the following commands: Allow all users to see all queued jobs: set server query_other_jobs=TRUE' Create the default <b>normal</b> queue create queue normal set queue normal queue_type = Execution set queue normal Priority = 10000 set queue normal enabled = True set queue normal started = True Create <b>lowpri</b> queue create queue lowpri set queue lowpri queue_type = Execution set queue lowpri Priority = 1000 set queue lowpri enabled = True set queue lowpri started = True Create <b>service</b> queue set queue service queue_type = Execution set queue service Priority = 100 set queue service enabled = True set queue service started = True Create <b>volunteer</b> queue set queue volunteer queue_type = Execution set queue volunteer Priority = 0 set queue volunteer enabled = True set queue volunteer started = True Set some server settings set server default_queue=normal set server resources_default.nodes = 1 set server resources_default.walltime = 12:00:00 Now edit the maui configuration for these queues: <source lang="text"> SERVERHOST heterogeneous ADMIN1 root RMPOLLINTERVAL 00:00:01 DEFERTIME 0 DEFERCOUNT 86400 PREEMPTIONPOLICY REQUEUE QUEUETIMEWEIGHT 1 QOSWEIGHT 1 SYSCFG QLIST=normal,lowpri,service,volunteer QOSCFG[normal] QFLAGS=PREEMPTOR QOSCFG[lowpri] QFLAGS=PREEMPTEE,PREEMPTOR QOSCFG[service] QFLAGS=PREEMPTEE,PREEMPTOR QOSCFG[volunteer] QFLAGS=PREEMPTEE CREDWEIGHT 1 CLASSWEIGHT 1 CLASSCFG[normal] QDEF=normal PRIORITY=10000 CLASSCFG[lowpri] QDEF=lowpri PRIORITY=1000 CLASSCFG[service] QDEF=service PRIORITY=100 CLASSCFG[volunteer] QDEF=volunteer PRIORITY=0 
</source> 9b2f6408791d51aec2453889bd1fed203b442a0d 506 505 2010-10-12T17:57:18Z Davepc 2 /* Queue Setup */ wikitext text/x-wiki * Basic installation of Debian Squeeze ==Networking== ===Interfaces=== * edit <code>/etc/networks/interfaces</code> Note that at some point eth1 should be configured by DHCP, it is on the UCD LAN and must be registered correctly (update MAC address with services). <code>eth0</code> is the internal network. <source lang="text"> # The loopback network interface auto lo eth0 eth1 iface lo inet loopback # The primary network interface allow-hotplug eth0 iface eth0 inet static address 192.168.21.254 netmask 255.255.255.0 gateway 192.168.21.1 iface eth1 inet static address 193.1.132.124 netmask 255.255.252.0 gateway 193.1.132.1 </source> * Install non-free linux firmware for network interface (eth0). This will allow Gigabit operation on eth0 with the tg3 hardware (I think). Edit <code>/etc/apt/sources.list</code> including the lines: <source lang="text">deb http://ftp.ie.debian.org/debian/ squeeze main contrib non-free deb-src http://ftp.ie.debian.org/debian/ squeeze main contrib non-free</source> * Install firmware-linux: <source lang="text">apt-get update && apt-get install firmware-linux</source> You probably need to reboot now. ===DNS / BIND=== We will run our own DNS server for the cluster. First set resolv.conf: <source lang="text"> nameserver 127.0.0.1 nameserver 137.43.116.19 nameserver 137.43.116.17 nameserver 137.43.105.22 domain ucd.ie search ucd.ie </source> Now install bind9 (<code>apt-get install bind9</code>). Edit <code>/etc/bind/named.conf.local</code> and set the domain zones for the cluster (forwards and reverse). 
We have two subdomains where reverse lookups will have to be specified 192.168.20 and 192.168.21 <source lang="text"> // // Do any local configuration here // // Consider adding the 1918 zones here, if they are not used in your // organization //include "/etc/bind/zones.rfc1918"; include "/etc/bind/rndc.key"; controls { inet 127.0.0.1 allow { localhost; } keys { "rndc-key"; }; }; zone "heterogeneous.ucd.ie" { type master; file "db.heterogeneous.ucd.ie"; }; zone "21.168.192.in-addr.arpa" { type master; file "db.192.168.21"; }; zone "20.168.192.in-addr.arpa" { type master; file "db.192.168.20"; }; </source> Also edit the options file: <code>/etc/bind/named.conf.options</code>, note the subnet we define in the allow sections, 192.168.20/23, it will permit access from 192.168.20.* and 192.168.21.* addresses. <source lang="text"> options { directory "/var/cache/bind"; // If there is a firewall between you and nameservers you want // to talk to, you may need to fix the firewall to allow multiple // ports to talk. See http://www.kb.cert.org/vuls/id/800113 // If your ISP provided one or more IP addresses for stable // nameservers, you probably want to use them as forwarders. // Uncomment the following block, and insert the addresses replacing // the all-0's placeholder. forwarders { 137.43.116.19; 137.43.116.17; 137.43.105.22; }; recursion yes; version "REFUSED"; allow-recursion { 127.0.0.1; 192.168.20.0/23; }; allow-query { 127.0.0.1; 192.168.20.0/23; }; auth-nxdomain no; # conform to RFC1035 listen-on-v6 { any; }; }; </source> Now work on the zone files specified <code>/var/cache/bind/db.heterogneneous.ucd.ie</code> and the reverse maps <code>/var/cache/bind/db.192.168.21</code> & <code>/var/cache/bind/db.192.168.21</code> Populate them with all nodes of the cluster. ===IP Tables=== * Set up <code>iptables</code>. We want to implement NAT between the internal network (<code>eth0</code>) and external one (<code>eth1</code>). 
Add a script to <code>/etc/network/if-up.d</code> directory, named <code>00iptables</code>. All scripts in this directory will be executed after network interfaces are brought up, so this will persist: <source lang="bash"> #!/bin/sh PATH=/usr/sbin:/sbin:/bin:/usr/bin IF_INT=eth0 IF_EXT=eth1 # # delete all existing rules. # iptables -F iptables -t nat -F iptables -t mangle -F iptables -X # Always accept loopback traffic iptables -A INPUT -i lo -j ACCEPT # Allow established connections, and those not coming from the outside iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT iptables -A INPUT -m state --state NEW ! -i $IF_EXT -j ACCEPT iptables -A FORWARD -i $IF_EXT -o $IF_INT -m state --state ESTABLISHED,RELATED -j ACCEPT # Allow outgoing connections from the LAN side. iptables -A FORWARD -i $IF_INT -o $IF_EXT -j ACCEPT # Masquerade. iptables -t nat -A POSTROUTING -o $IF_EXT -j MASQUERADE # Don't forward from the outside to the inside. iptables -A FORWARD -i $IF_EXT -o $IF_EXT -j REJECT # Enable routing. echo 1 > /proc/sys/net/ipv4/ip_forward </source> ==NFS== Install: apt-get install nfs-kernel-server nfs-common portmap Add to /etc/exports /home 192.168.20.0/255.255.254.0(rw,sync,no_root_squash,no_subtree_check) Restart nfs server with exportfs -a ==Clonezilla== Firstly, Clonezilla is probably going to pollute a lot of your server configuration when it sets itself up. Be prepared to loose your IPtables configuration, NFS (if any) and DHCP settings. Maybe more. * follow the guide to installing Clonezilla [http://www.howtoforge.com/cloning-linux-systems-with-clonezilla-server-edition-clonezilla-se-p2 here]. 
Essentially: ** add repository key <source lang="text">wget -q http://drbl.sourceforge.net/GPG-KEY-DRBL -O- | apt-key add -</source> ** the line add to /etc/apt/sources.list: <source lang="text">deb http://drbl.sourceforge.net/drbl-core drbl stable</source> ** run: <source lang="text">apt-get update && apt-get install drbl && /opt/drbl/sbin/drbl4imp</source> ** accept default options to drbl4imp. * After Clonezilla has installed edit <code>/etc/dhcpd3/dhcpd.conf</code>, adding all entries for test nodes <code>hcl07</code> and <code>hcl03</code>. Also ensure these nodes have been removed from the inplace heterogeneous.ucd.ie server so that they are only served by one machine. <source lang="text"> default-lease-time 300; max-lease-time 300; option subnet-mask 255.255.255.0; option domain-name-servers 137.43.116.19,137.43.116.17,137.43.105.22; option domain-name "ucd.ie"; ddns-update-style none; # brett had ad-hoc ...? server-name drbl; filename = "pxelinux.0"; subnet 192.168.21.0 netmask 255.255.255.0 { option subnet-mask 255.255.255.0; option routers 192.168.21.1; next-server 192.168.21.254; pool { # allow members of "DRBL-Client"; range 192.168.21.200 192.168.21.212; } host hcl03 { option host-name "hcl03.ucd.ie"; hardware ethernet 00:14:22:0A:22:6C; fixed-address 192.168.21.5; } host hcl03_eth1 { option host-name "hcl03_eth1.ucd.ie"; hardware ethernet 00:14:22:0A:22:6D; fixed-address 192.168.21.105; } host hcl07 { option host-name "hcl07.ucd.ie"; hardware ethernet 00:14:22:0A:20:E2; fixed-address 192.168.21.9; } host hcl07_eth1 { option host-name "hcl07_eth1.ucd.ie"; hardware ethernet 00:14:22:0A:20:E3; fixed-address 192.168.21.109; } default-lease-time 21600; max-lease-time 43200; } </source> ==NTP Daemon== Nodes in the cluster should use the Network Time Protocol to set their clocks, heterogeneous.ucd.ie should provide them that service. 
Install ntpd with the following command: apt-get install ntp Configure the daemon with the following line in <code>/etc/ntp.conf</code>: restrict 192.168.20.0 mask 255.255.254.0 nomodify notrap ==Install DHCP== Install the DHCP server package with <code>apt-get install dhcp3-server</code>. When you install Clonezilla it will probably pollute your DHCP server setup, so make ... TODO ==Install NIS== Copy users from <code>passwd</code>, <code>groups</code> and <code>shadow</code> from <code>/etc</code> on <code>hcl01</code>. Install nis. Set domain to heterogeneous.ucd.ie Edit <code>/etc/defaultdomain</code> so that it contains: heterogeneous.ucd.ie Edit <code>/etc/default/nis</code> so that it contains: # Are we a NIS server and if so what kind (values: false, slave, master) NISSERVER=master Edit <code>/etc/ypserv.securenets</code> so that is contains: # allow connects from local 255.0.0.0 127.0.0.0 # allow connections from heterogeneous subnets .20 and .21 255.255.254.0 192.168.20.0 The NIS host is also a client of itself, so do the client set up as follows: Edit <code>/etc/hosts</code> end ensure the NIS Master is listed 192.168.20.254 heterogeneous.ucd.ie heterogeneous Edit <code>/etc/yp.conf</code> and ensure that it contains: domain heterogeneous.ucd.ie server localhost Edit <code>/etc/passwd</code> adding a line to the end that reads: <code>+::::::</code>. Edit <code>/etc/group</code> with a line <code>+:::</code> at the line. The NIS Makefile will not pull userid and groupids that are lower than a certain value, we must set this to 500 in <code>/var/yp/Makefile</code> MINUID=500 MINGID=500 Start the <code>ypbind</code> and <code>yppasswd</code> services. Then initialise the NIS database: /usr/lib/yp/ypinit -m Accept defaults at prompts. Now start other NIS services service nis start ==Installing Ganglia Frontend== Install the packages gmetad and ganglia-webfrontend. 
Configure the front end by appending to <code>/etc/apache2/apache2.conf</code>, the following: Include /etc/ganglia-webfrontend/apache.conf Configure gmetad by adding to the <code>/etc/ganglia/gmetad.conf</code>, the following line: data_source "HCL Cluster" 192.168.20.1 192.168.20.16 data_source "HCL Service" localhost:8648 This means that the gmetad collector connect to hcl01 and hcl16 on the .20 subnet to gather data for the frontend to use. After all packages are configured execute: <source lang="text"> service apache2 restart service gmetad restart </source> Pointing your browser to [http://heterogeneous.ucd.ie/ganglia/index.php here] should display the monitoring page for HCL Cluster. <code>gmond</code> must also be installed and configured on the cluster nodes. =Hardware Monitoring & Backup= ==Disk Monitoring== Install smartmontools as per [http://www.howtoforge.com/checking-hard-disk-sanity-with-smartmontools-debian-ubuntu here]. Briefly: apt-get install smartmontools Edit <code>/etc/defaults/smartmontools</code> so that it contains: <source lang="text"> # List of devices you want to explicitly enable S.M.A.R.T. for # Not needed (and not recommended) if the device is monitored by smartd enable_smart="/dev/sda" # uncomment to start smartd on system startup start_smartd=yes # uncomment to pass additional options to smartd on startup smartd_opts="--interval=1800" </source> Open <code>/etc/smartd.conf</code> and edit the first line that begins with DEVICESCAN (all lines after the first instance of DEVICESCAN are ignored). Have it read something like: DEVICESCAN -d removable -n standby -m root -m robert_higgins@iol.ie -M exec /usr/share/smartmontools/smartd-runner Then start the service <code>/etc/init.d/smartmontools start</code> Note, consider installing this on all nodes, as it would be interesting to have prior notice of any failing disks. = Torque - PBS = Download Torque from [http://www.clusterresources.com/pages/products/torque-resource-manager.php here]. 
Extract archive and configure: ./configure --prefix=/usr/local Run <code>make && make install</code>. Torque will have installed files under <code>/usr/local</code> and <code>/var/spool/torque</code>. Edit the Torque config file <code>/var/spool/torque/torque.cfg</code> adding the line: SERVERHOST=192.168.20.254 Add the compute nodes of the cluster to the file <code>/var/spool/torque/server_priv/nodes</code> <source lang="text"> hcl01 hcl02 hcl03 hcl04 hcl05 hcl06 hcl07 hcl08 hcl09 hcl10 hcl11 hcl12 hcl13 hcl14 hcl15 hcl16 </source> By default /usr/local/lib may not be in the list of directories to search for dynamic-linked libraries. If so, add that path to the end of /etc/ld.so.conf.d/local.conf, and run "ldconfig" to update. Now return to the extracted source distribution, find the <code>contrib/init.d</code> folder and copy the <code>debian.pbs_server</code> to <code>/etc/init.d/pbs_server</code>. Edit the inplace file so that the header reads: <source lang="bash"> #!/bin/sh ### BEGIN INIT INFO # Provides: pbs_server # Required-Start: $local_fs $named # Should-Start: # Required-Stop: </source> This will ensure that important services are started before the PBS/Torque daemon. then run update-rc.d pbs_server defaults pbs_server -t create qterm pbs_server ===Queues=== We will configure 4 queues, with varying priority and preemption policies. *The first is the <b>normal</b> queue. This will be where most jobs, which should not be interrupted, will be placed. It has the highest priority. *The second queue is <b>lowpri</b>, it is for running jobs which may run for extended periods but are interruptible. Should a job <i>A</i> be placed on the normal queue while a lowpri job <i>B</i> is running, <i>B</i> will be sent a kill signal, so it may shut down cleanly, and then it will be requeued on the system, so it can resume running after <i>A</i>b has finished. *The third queue is for running of service jobs like the homedir backup. It is named <b>service</b>. 
This will have lower priority than the above queues and jobs running on this queue will be preemptable. *The final queue is the <b>volunteer</b> queue. This has the lowest priority and is for executing volunteer computing jobs during otherwise idle periods. These jobs will also be preemptable. ====Queue Setup==== Enter the queue manager program <code>qmgr</code> as the root user on heterogeneous. Send it the following commands: Allow all users to see all queued jobs: set server query_other_jobs=TRUE' Create the default <b>normal</b> queue create queue normal set queue normal queue_type = Execution set queue normal Priority = 10000 set queue normal enabled = True set queue normal started = True Create <b>lowpri</b> queue create queue lowpri set queue lowpri queue_type = Execution set queue lowpri Priority = 1000 set queue lowpri enabled = True set queue lowpri started = True Create <b>service</b> queue create queue service set queue service queue_type = Execution set queue service Priority = 100 set queue service enabled = True set queue service started = True Create <b>volunteer</b> queue create queue volunteer set queue volunteer queue_type = Execution set queue volunteer Priority = 0 set queue volunteer enabled = True set queue volunteer started = True Set some server settings set server default_queue=normal set server resources_default.nodes = 1 set server resources_default.walltime = 12:00:00 Now edit the maui configuration for these queues: <source lang="text"> SERVERHOST heterogeneous ADMIN1 root RMPOLLINTERVAL 00:00:01 DEFERTIME 0 DEFERCOUNT 86400 PREEMPTIONPOLICY REQUEUE QUEUETIMEWEIGHT 1 QOSWEIGHT 1 SYSCFG QLIST=normal,lowpri,service,volunteer QOSCFG[normal] QFLAGS=PREEMPTOR QOSCFG[lowpri] QFLAGS=PREEMPTEE,PREEMPTOR QOSCFG[service] QFLAGS=PREEMPTEE,PREEMPTOR QOSCFG[volunteer] QFLAGS=PREEMPTEE CREDWEIGHT 1 CLASSWEIGHT 1 CLASSCFG[normal] QDEF=normal PRIORITY=10000 CLASSCFG[lowpri] QDEF=lowpri PRIORITY=1000 CLASSCFG[service] QDEF=service PRIORITY=100 
CLASSCFG[volunteer] QDEF=volunteer PRIORITY=0
</source>

* Basic installation of Debian Squeeze

==Networking==
===Interfaces===
* Edit <code>/etc/network/interfaces</code>. Note that at some point eth1 should be configured by DHCP; it is on the UCD LAN and must be registered correctly (update MAC address with services). <code>eth0</code> is the internal network.
<source lang="text">
# The loopback network interface
auto lo eth0 eth1
iface lo inet loopback

# The primary network interface
allow-hotplug eth0
iface eth0 inet static
    address 192.168.21.254
    netmask 255.255.255.0
    gateway 192.168.21.1

iface eth1 inet static
    address 193.1.132.124
    netmask 255.255.252.0
    gateway 193.1.132.1
</source>
* Install non-free Linux firmware for the network interface (eth0). This will allow Gigabit operation on eth0 with the tg3 hardware (I think). Edit <code>/etc/apt/sources.list</code>, including the lines:
<source lang="text">
deb http://ftp.ie.debian.org/debian/ squeeze main contrib non-free
deb-src http://ftp.ie.debian.org/debian/ squeeze main contrib non-free
</source>
* Install firmware-linux:
<source lang="text">apt-get update && apt-get install firmware-linux</source>
You probably need to reboot now.

===DNS / BIND===
We will run our own DNS server for the cluster. First set <code>resolv.conf</code>:
<source lang="text">
nameserver 127.0.0.1
nameserver 137.43.116.19
nameserver 137.43.116.17
nameserver 137.43.105.22
domain ucd.ie
search ucd.ie
</source>
Now install bind9 (<code>apt-get install bind9</code>). Edit <code>/etc/bind/named.conf.local</code> and set the domain zones for the cluster (forward and reverse).
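Each zone file declared in the configuration that follows needs an SOA and NS record plus address records for the cluster nodes. A minimal sketch of the forward zone file (serial, TTLs, hostnames and addresses are illustrative, not taken from the actual cluster):
<source lang="text">
$TTL 86400
@   IN  SOA heterogeneous.ucd.ie. root.heterogeneous.ucd.ie. (
        2010101201  ; serial -- bump on every edit
        3600        ; refresh
        900         ; retry
        604800      ; expire
        86400 )     ; negative-cache TTL
    IN  NS  heterogeneous.ucd.ie.
heterogeneous   IN  A   192.168.20.254
hcl01           IN  A   192.168.20.1
hcl02           IN  A   192.168.20.2
hcl16           IN  A   192.168.20.16
</source>
After editing, <code>named-checkzone heterogeneous.ucd.ie db.heterogeneous.ucd.ie</code> will validate the file before reloading.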
We have two subnets for which reverse lookups will have to be specified: 192.168.20 and 192.168.21.

<source lang="text">
//
// Do any local configuration here
//
// Consider adding the 1918 zones here, if they are not used in your
// organization
//include "/etc/bind/zones.rfc1918";

include "/etc/bind/rndc.key";

controls {
	inet 127.0.0.1 allow { localhost; } keys { "rndc-key"; };
};

zone "heterogeneous.ucd.ie" {
	type master;
	file "db.heterogeneous.ucd.ie";
};

zone "21.168.192.in-addr.arpa" {
	type master;
	file "db.192.168.21";
};

zone "20.168.192.in-addr.arpa" {
	type master;
	file "db.192.168.20";
};
</source>

Also edit the options file, <code>/etc/bind/named.conf.options</code>. Note the subnet we define in the allow sections, 192.168.20/23: it permits access from 192.168.20.* and 192.168.21.* addresses.

<source lang="text">
options {
	directory "/var/cache/bind";

	// If there is a firewall between you and nameservers you want
	// to talk to, you may need to fix the firewall to allow multiple
	// ports to talk.  See http://www.kb.cert.org/vuls/id/800113

	// If your ISP provided one or more IP addresses for stable
	// nameservers, you probably want to use them as forwarders.
	// Uncomment the following block, and insert the addresses replacing
	// the all-0's placeholder.

	forwarders {
		137.43.116.19;
		137.43.116.17;
		137.43.105.22;
	};

	recursion yes;
	version "REFUSED";
	allow-recursion { 127.0.0.1; 192.168.20.0/23; };
	allow-query { 127.0.0.1; 192.168.20.0/23; };

	auth-nxdomain no;    # conform to RFC1035
	listen-on-v6 { any; };
};
</source>

Now work on the zone files specified above, <code>/var/cache/bind/db.heterogeneous.ucd.ie</code>, and the reverse maps <code>/var/cache/bind/db.192.168.20</code> & <code>/var/cache/bind/db.192.168.21</code>. Populate them with all nodes of the cluster.

===IP Tables===
* Set up <code>iptables</code>. We want to implement NAT between the internal network (<code>eth0</code>) and the external one (<code>eth1</code>).
Add a script named <code>00iptables</code> to the <code>/etc/network/if-up.d</code> directory. All scripts in this directory are executed after the network interfaces are brought up, so the rules will persist across reboots:

<source lang="bash">
#!/bin/sh
PATH=/usr/sbin:/sbin:/bin:/usr/bin

IF_INT=eth0
IF_EXT=eth1

#
# delete all existing rules.
#
iptables -F
iptables -t nat -F
iptables -t mangle -F
iptables -X

# Always accept loopback traffic
iptables -A INPUT -i lo -j ACCEPT

# Allow established connections, and those not coming from the outside
iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
iptables -A INPUT -m state --state NEW ! -i $IF_EXT -j ACCEPT
iptables -A FORWARD -i $IF_EXT -o $IF_INT -m state --state ESTABLISHED,RELATED -j ACCEPT

# Allow outgoing connections from the LAN side.
iptables -A FORWARD -i $IF_INT -o $IF_EXT -j ACCEPT

# Masquerade.
iptables -t nat -A POSTROUTING -o $IF_EXT -j MASQUERADE

# Don't forward new connections from the outside to the inside.
iptables -A FORWARD -i $IF_EXT -o $IF_INT -j REJECT

# Enable routing.
echo 1 > /proc/sys/net/ipv4/ip_forward
</source>

==NFS==
Install:
 apt-get install nfs-kernel-server nfs-common portmap

Add to <code>/etc/exports</code>:
 /home 192.168.20.0/255.255.254.0(rw,sync,no_root_squash,no_subtree_check)

Re-export the shares with:
 exportfs -a

==Clonezilla==
Firstly, Clonezilla is probably going to pollute a lot of your server configuration when it sets itself up. Be prepared to lose your iptables configuration, NFS settings (if any) and DHCP settings. Maybe more.

* Follow the guide to installing Clonezilla [http://www.howtoforge.com/cloning-linux-systems-with-clonezilla-server-edition-clonezilla-se-p2 here].
Essentially:
** add the repository key:
<source lang="text">wget -q http://drbl.sourceforge.net/GPG-KEY-DRBL -O- | apt-key add -</source>
** add the line to <code>/etc/apt/sources.list</code>:
<source lang="text">deb http://drbl.sourceforge.net/drbl-core drbl stable</source>
** run:
<source lang="text">apt-get update && apt-get install drbl && /opt/drbl/sbin/drbl4imp</source>
** accept the default options to drbl4imp.

* After Clonezilla has installed, edit <code>/etc/dhcp3/dhcpd.conf</code>, adding entries for the test nodes <code>hcl07</code> and <code>hcl03</code>. Also ensure these nodes have been removed from the existing heterogeneous.ucd.ie DHCP server so that they are only served by one machine.

<source lang="text">
default-lease-time 300;
max-lease-time 300;
option subnet-mask 255.255.255.0;
option domain-name-servers 137.43.116.19,137.43.116.17,137.43.105.22;
option domain-name "ucd.ie";
ddns-update-style none;   # brett had ad-hoc ...?
server-name drbl;
filename = "pxelinux.0";

subnet 192.168.21.0 netmask 255.255.255.0 {
	option subnet-mask 255.255.255.0;
	option routers 192.168.21.1;
	next-server 192.168.21.254;
	pool {
		# allow members of "DRBL-Client";
		range 192.168.21.200 192.168.21.212;
	}
	host hcl03 {
		option host-name "hcl03.ucd.ie";
		hardware ethernet 00:14:22:0A:22:6C;
		fixed-address 192.168.21.5;
	}
	host hcl03_eth1 {
		option host-name "hcl03_eth1.ucd.ie";
		hardware ethernet 00:14:22:0A:22:6D;
		fixed-address 192.168.21.105;
	}
	host hcl07 {
		option host-name "hcl07.ucd.ie";
		hardware ethernet 00:14:22:0A:20:E2;
		fixed-address 192.168.21.9;
	}
	host hcl07_eth1 {
		option host-name "hcl07_eth1.ucd.ie";
		hardware ethernet 00:14:22:0A:20:E3;
		fixed-address 192.168.21.109;
	}
	default-lease-time 21600;
	max-lease-time 43200;
}
</source>

==NTP Daemon==
Nodes in the cluster should use the Network Time Protocol to set their clocks; heterogeneous.ucd.ie should provide them that service.
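On the node side, each client only needs to be pointed at the head node; the head-node setup itself follows below. A sketch of the relevant line in a node's <code>/etc/ntp.conf</code> (using the internal hostname is an assumption — the internal IP would work equally well):

<source lang="text">
server heterogeneous.ucd.ie iburst
</source>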
Install ntpd with the following command:
 apt-get install ntp

Configure the daemon with the following line in <code>/etc/ntp.conf</code>:
 restrict 192.168.20.0 mask 255.255.254.0 nomodify notrap

==Install DHCP==
Install the DHCP server package with <code>apt-get install dhcp3-server</code>. When you install Clonezilla it will probably pollute your DHCP server setup, so make ... TODO

==Install NIS==
Copy the users from <code>passwd</code>, <code>group</code> and <code>shadow</code> in <code>/etc</code> on <code>hcl01</code>. Install nis. Set the domain to heterogeneous.ucd.ie.

Edit <code>/etc/defaultdomain</code> so that it contains:
 heterogeneous.ucd.ie

Edit <code>/etc/default/nis</code> so that it contains:
 # Are we a NIS server and if so what kind (values: false, slave, master)
 NISSERVER=master

Edit <code>/etc/ypserv.securenets</code> so that it contains:
 # allow connections from local
 255.0.0.0	127.0.0.0
 # allow connections from heterogeneous subnets .20 and .21
 255.255.254.0	192.168.20.0

The NIS host is also a client of itself, so do the client setup as follows. Edit <code>/etc/hosts</code> and ensure the NIS master is listed:
 192.168.20.254	heterogeneous.ucd.ie	heterogeneous

Edit <code>/etc/yp.conf</code> and ensure that it contains:
 domain heterogeneous.ucd.ie server localhost

Edit <code>/etc/passwd</code>, adding a line to the end that reads <code>+::::::</code>. Edit <code>/etc/group</code>, adding a line <code>+:::</code> at the end. The NIS Makefile will not pull user and group IDs that are lower than a certain value; we must set this to 500 in <code>/var/yp/Makefile</code>:
 MINUID=500
 MINGID=500

Start the <code>ypbind</code> and <code>yppasswd</code> services. Then initialise the NIS database:
 /usr/lib/yp/ypinit -m

Accept the defaults at the prompts. Now start the other NIS services:
 service nis start

==Installing Ganglia Frontend==
Install the packages gmetad and ganglia-webfrontend.
Configure the front end by appending the following to <code>/etc/apache2/apache2.conf</code>:
 Include /etc/ganglia-webfrontend/apache.conf

Configure gmetad by adding the following lines to <code>/etc/ganglia/gmetad.conf</code>:
 data_source "HCL Cluster" 192.168.20.1 192.168.20.16
 data_source "HCL Service" localhost:8648

This means that the gmetad collector connects to hcl01 and hcl16 on the .20 subnet to gather data for the frontend to use. After all packages are configured, execute:

<source lang="text">
service apache2 restart
service gmetad restart
</source>

Pointing your browser to [http://heterogeneous.ucd.ie/ganglia/index.php here] should display the monitoring page for the HCL Cluster. <code>gmond</code> must also be installed and configured on the cluster nodes.

=Hardware Monitoring & Backup=
==Disk Monitoring==
Install smartmontools as per [http://www.howtoforge.com/checking-hard-disk-sanity-with-smartmontools-debian-ubuntu here]. Briefly:
 apt-get install smartmontools

Edit <code>/etc/default/smartmontools</code> so that it contains:

<source lang="text">
# List of devices you want to explicitly enable S.M.A.R.T. for
# Not needed (and not recommended) if the device is monitored by smartd
enable_smart="/dev/sda"
# uncomment to start smartd on system startup
start_smartd=yes
# uncomment to pass additional options to smartd on startup
smartd_opts="--interval=1800"
</source>

Open <code>/etc/smartd.conf</code> and edit the first line that begins with DEVICESCAN (all lines after the first instance of DEVICESCAN are ignored). Have it read something like:
 DEVICESCAN -d removable -n standby -m root -m robert_higgins@iol.ie -M exec /usr/share/smartmontools/smartd-runner

Then start the service: <code>/etc/init.d/smartmontools start</code>

Note: consider installing this on all nodes, as it would be useful to have advance warning of any failing disks.

= Torque - PBS =
Download Torque from [http://www.clusterresources.com/pages/products/torque-resource-manager.php here].
Extract the archive and configure:
 ./configure --prefix=/usr/local

Run <code>make && make install</code>. Torque will have installed files under <code>/usr/local</code> and <code>/var/spool/torque</code>. Edit the Torque config file <code>/var/spool/torque/torque.cfg</code>, adding the line:
 SERVERHOST=192.168.20.254

Add the compute nodes of the cluster to the file <code>/var/spool/torque/server_priv/nodes</code>:

<source lang="text">
hcl01
hcl02
hcl03
hcl04
hcl05
hcl06
hcl07
hcl08
hcl09
hcl10
hcl11
hcl12
hcl13
hcl14
hcl15
hcl16
</source>

By default <code>/usr/local/lib</code> may not be in the list of directories searched for dynamically-linked libraries. If so, add that path to the end of <code>/etc/ld.so.conf.d/local.conf</code> and run <code>ldconfig</code> to update the cache.

Now return to the extracted source distribution, find the <code>contrib/init.d</code> folder and copy <code>debian.pbs_server</code> to <code>/etc/init.d/pbs_server</code>. Edit the copied file so that the header reads:

<source lang="bash">
#!/bin/sh
### BEGIN INIT INFO
# Provides:       pbs_server
# Required-Start: $local_fs $named
# Should-Start:
# Required-Stop:
</source>

This will ensure that important services are started before the PBS/Torque daemon. Then run:
 update-rc.d pbs_server defaults
 pbs_server -t create
 qterm
 pbs_server

===Packages for nodes===
Build the self-extracting packages for the compute nodes and copy them out:
 make packages
 scp torque-package-mom-linux-i686.sh hclxx:
 scp torque-package-clients-linux-i686.sh hclxx:

===Queues===
We will configure 4 queues, with varying priority and preemption policies.
*The first is the <b>normal</b> queue. This is where most jobs, which should not be interrupted, will be placed. It has the highest priority.
*The second queue is <b>lowpri</b>; it is for jobs which may run for extended periods but are interruptible. Should a job <i>A</i> be placed on the normal queue while a lowpri job <i>B</i> is running, <i>B</i> will be sent a kill signal so that it may shut down cleanly, and it will then be requeued on the system, resuming after <i>A</i> has finished.
*The third queue is for running service jobs like the homedir backup. It is named <b>service</b>.
This will have lower priority than the queues above, and jobs running on it will be preemptable.
*The final queue is the <b>volunteer</b> queue. This has the lowest priority and is for executing volunteer computing jobs during otherwise idle periods. These jobs will also be preemptable.

====Queue Setup====
Enter the queue manager program <code>qmgr</code> as the root user on heterogeneous and send it the following commands.

Allow all users to see all queued jobs:
 set server query_other_jobs = True

Create the default <b>normal</b> queue:
 create queue normal
 set queue normal queue_type = Execution
 set queue normal Priority = 10000
 set queue normal enabled = True
 set queue normal started = True

Create the <b>lowpri</b> queue:
 create queue lowpri
 set queue lowpri queue_type = Execution
 set queue lowpri Priority = 1000
 set queue lowpri enabled = True
 set queue lowpri started = True

Create the <b>service</b> queue:
 create queue service
 set queue service queue_type = Execution
 set queue service Priority = 100
 set queue service enabled = True
 set queue service started = True

Create the <b>volunteer</b> queue:
 create queue volunteer
 set queue volunteer queue_type = Execution
 set queue volunteer Priority = 0
 set queue volunteer enabled = True
 set queue volunteer started = True

Set some server defaults:
 set server default_queue = normal
 set server resources_default.nodes = 1
 set server resources_default.walltime = 12:00:00

= Maui Scheduler =
Download from [http://www.clusterresources.com/product/maui/ here].
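Maui is built from source much like Torque. The usual sequence is roughly the following sketch — the tarball name and version are illustrative, and the <code>--with-pbs</code> path is assumed to match the Torque prefix used above:

<source lang="text">
tar xzf maui-3.x.tar.gz && cd maui-3.x
./configure --prefix=/usr/local/maui --with-pbs=/usr/local
make && make install
</source>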
Now edit the maui configuration for these queues: <source lang="text"> SERVERHOST heterogeneous ADMIN1 root RMPOLLINTERVAL 00:00:01 DEFERTIME 0 DEFERCOUNT 86400 PREEMPTIONPOLICY REQUEUE QUEUETIMEWEIGHT 1 QOSWEIGHT 1 SYSCFG QLIST=normal,lowpri,service,volunteer QOSCFG[normal] QFLAGS=PREEMPTOR QOSCFG[lowpri] QFLAGS=PREEMPTEE,PREEMPTOR QOSCFG[service] QFLAGS=PREEMPTEE,PREEMPTOR QOSCFG[volunteer] QFLAGS=PREEMPTEE CREDWEIGHT 1 CLASSWEIGHT 1 CLASSCFG[normal] QDEF=normal PRIORITY=10000 CLASSCFG[lowpri] QDEF=lowpri PRIORITY=1000 CLASSCFG[service] QDEF=service PRIORITY=100 CLASSCFG[volunteer] QDEF=volunteer PRIORITY=0 </source> c747e234e13f07954bd18d9ec39cb4e15c45c706 515 509 2010-10-15T13:58:53Z Davepc 2 /* Torque - PBS */ wikitext text/x-wiki * Basic installation of Debian Squeeze ==Networking== ===Interfaces=== * edit <code>/etc/networks/interfaces</code> Note that at some point eth1 should be configured by DHCP, it is on the UCD LAN and must be registered correctly (update MAC address with services). <code>eth0</code> is the internal network. <source lang="text"> # The loopback network interface auto lo eth0 eth1 iface lo inet loopback # The primary network interface allow-hotplug eth0 iface eth0 inet static address 192.168.21.254 netmask 255.255.255.0 gateway 192.168.21.1 iface eth1 inet static address 193.1.132.124 netmask 255.255.252.0 gateway 193.1.132.1 </source> * Install non-free linux firmware for network interface (eth0). This will allow Gigabit operation on eth0 with the tg3 hardware (I think). Edit <code>/etc/apt/sources.list</code> including the lines: <source lang="text">deb http://ftp.ie.debian.org/debian/ squeeze main contrib non-free deb-src http://ftp.ie.debian.org/debian/ squeeze main contrib non-free</source> * Install firmware-linux: <source lang="text">apt-get update && apt-get install firmware-linux</source> You probably need to reboot now. ===DNS / BIND=== We will run our own DNS server for the cluster. 
First set resolv.conf: <source lang="text"> nameserver 127.0.0.1 nameserver 137.43.116.19 nameserver 137.43.116.17 nameserver 137.43.105.22 domain ucd.ie search ucd.ie </source> Now install bind9 (<code>apt-get install bind9</code>). Edit <code>/etc/bind/named.conf.local</code> and set the domain zones for the cluster (forwards and reverse). We have two subdomains where reverse lookups will have to be specified 192.168.20 and 192.168.21 <source lang="text"> // // Do any local configuration here // // Consider adding the 1918 zones here, if they are not used in your // organization //include "/etc/bind/zones.rfc1918"; include "/etc/bind/rndc.key"; controls { inet 127.0.0.1 allow { localhost; } keys { "rndc-key"; }; }; zone "heterogeneous.ucd.ie" { type master; file "db.heterogeneous.ucd.ie"; }; zone "21.168.192.in-addr.arpa" { type master; file "db.192.168.21"; }; zone "20.168.192.in-addr.arpa" { type master; file "db.192.168.20"; }; </source> Also edit the options file: <code>/etc/bind/named.conf.options</code>, note the subnet we define in the allow sections, 192.168.20/23, it will permit access from 192.168.20.* and 192.168.21.* addresses. <source lang="text"> options { directory "/var/cache/bind"; // If there is a firewall between you and nameservers you want // to talk to, you may need to fix the firewall to allow multiple // ports to talk. See http://www.kb.cert.org/vuls/id/800113 // If your ISP provided one or more IP addresses for stable // nameservers, you probably want to use them as forwarders. // Uncomment the following block, and insert the addresses replacing // the all-0's placeholder. 
forwarders { 137.43.116.19; 137.43.116.17; 137.43.105.22; }; recursion yes; version "REFUSED"; allow-recursion { 127.0.0.1; 192.168.20.0/23; }; allow-query { 127.0.0.1; 192.168.20.0/23; }; auth-nxdomain no; # conform to RFC1035 listen-on-v6 { any; }; }; </source> Now work on the zone files specified <code>/var/cache/bind/db.heterogneneous.ucd.ie</code> and the reverse maps <code>/var/cache/bind/db.192.168.21</code> & <code>/var/cache/bind/db.192.168.21</code> Populate them with all nodes of the cluster. ===IP Tables=== * Set up <code>iptables</code>. We want to implement NAT between the internal network (<code>eth0</code>) and external one (<code>eth1</code>). Add a script to <code>/etc/network/if-up.d</code> directory, named <code>00iptables</code>. All scripts in this directory will be executed after network interfaces are brought up, so this will persist: <source lang="bash"> #!/bin/sh PATH=/usr/sbin:/sbin:/bin:/usr/bin IF_INT=eth0 IF_EXT=eth1 # # delete all existing rules. # iptables -F iptables -t nat -F iptables -t mangle -F iptables -X # Always accept loopback traffic iptables -A INPUT -i lo -j ACCEPT # Allow established connections, and those not coming from the outside iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT iptables -A INPUT -m state --state NEW ! -i $IF_EXT -j ACCEPT iptables -A FORWARD -i $IF_EXT -o $IF_INT -m state --state ESTABLISHED,RELATED -j ACCEPT # Allow outgoing connections from the LAN side. iptables -A FORWARD -i $IF_INT -o $IF_EXT -j ACCEPT # Masquerade. iptables -t nat -A POSTROUTING -o $IF_EXT -j MASQUERADE # Don't forward from the outside to the inside. iptables -A FORWARD -i $IF_EXT -o $IF_EXT -j REJECT # Enable routing. 
echo 1 > /proc/sys/net/ipv4/ip_forward </source> ==NFS== Install: apt-get install nfs-kernel-server nfs-common portmap Add to /etc/exports /home 192.168.20.0/255.255.254.0(rw,sync,no_root_squash,no_subtree_check) Restart nfs server with exportfs -a ==Clonezilla== Firstly, Clonezilla is probably going to pollute a lot of your server configuration when it sets itself up. Be prepared to loose your IPtables configuration, NFS (if any) and DHCP settings. Maybe more. * follow the guide to installing Clonezilla [http://www.howtoforge.com/cloning-linux-systems-with-clonezilla-server-edition-clonezilla-se-p2 here]. Essentially: ** add repository key <source lang="text">wget -q http://drbl.sourceforge.net/GPG-KEY-DRBL -O- | apt-key add -</source> ** the line add to /etc/apt/sources.list: <source lang="text">deb http://drbl.sourceforge.net/drbl-core drbl stable</source> ** run: <source lang="text">apt-get update && apt-get install drbl && /opt/drbl/sbin/drbl4imp</source> ** accept default options to drbl4imp. * After Clonezilla has installed edit <code>/etc/dhcpd3/dhcpd.conf</code>, adding all entries for test nodes <code>hcl07</code> and <code>hcl03</code>. Also ensure these nodes have been removed from the inplace heterogeneous.ucd.ie server so that they are only served by one machine. <source lang="text"> default-lease-time 300; max-lease-time 300; option subnet-mask 255.255.255.0; option domain-name-servers 137.43.116.19,137.43.116.17,137.43.105.22; option domain-name "ucd.ie"; ddns-update-style none; # brett had ad-hoc ...? 
server-name drbl; filename = "pxelinux.0"; subnet 192.168.21.0 netmask 255.255.255.0 { option subnet-mask 255.255.255.0; option routers 192.168.21.1; next-server 192.168.21.254; pool { # allow members of "DRBL-Client"; range 192.168.21.200 192.168.21.212; } host hcl03 { option host-name "hcl03.ucd.ie"; hardware ethernet 00:14:22:0A:22:6C; fixed-address 192.168.21.5; } host hcl03_eth1 { option host-name "hcl03_eth1.ucd.ie"; hardware ethernet 00:14:22:0A:22:6D; fixed-address 192.168.21.105; } host hcl07 { option host-name "hcl07.ucd.ie"; hardware ethernet 00:14:22:0A:20:E2; fixed-address 192.168.21.9; } host hcl07_eth1 { option host-name "hcl07_eth1.ucd.ie"; hardware ethernet 00:14:22:0A:20:E3; fixed-address 192.168.21.109; } default-lease-time 21600; max-lease-time 43200; } </source> ==NTP Daemon== Nodes in the cluster should use the Network Time Protocol to set their clocks, heterogeneous.ucd.ie should provide them that service. Install ntpd with the following command: apt-get install ntp Configure the daemon with the following line in <code>/etc/ntp.conf</code>: restrict 192.168.20.0 mask 255.255.254.0 nomodify notrap ==Install DHCP== Install the DHCP server package with <code>apt-get install dhcp3-server</code>. When you install Clonezilla it will probably pollute your DHCP server setup, so make ... TODO ==Install NIS== Copy users from <code>passwd</code>, <code>groups</code> and <code>shadow</code> from <code>/etc</code> on <code>hcl01</code>. Install nis. 
Set domain to heterogeneous.ucd.ie Edit <code>/etc/defaultdomain</code> so that it contains: heterogeneous.ucd.ie Edit <code>/etc/default/nis</code> so that it contains: # Are we a NIS server and if so what kind (values: false, slave, master) NISSERVER=master Edit <code>/etc/ypserv.securenets</code> so that is contains: # allow connects from local 255.0.0.0 127.0.0.0 # allow connections from heterogeneous subnets .20 and .21 255.255.254.0 192.168.20.0 The NIS host is also a client of itself, so do the client set up as follows: Edit <code>/etc/hosts</code> end ensure the NIS Master is listed 192.168.20.254 heterogeneous.ucd.ie heterogeneous Edit <code>/etc/yp.conf</code> and ensure that it contains: domain heterogeneous.ucd.ie server localhost Edit <code>/etc/passwd</code> adding a line to the end that reads: <code>+::::::</code>. Edit <code>/etc/group</code> with a line <code>+:::</code> at the line. The NIS Makefile will not pull userid and groupids that are lower than a certain value, we must set this to 500 in <code>/var/yp/Makefile</code> MINUID=500 MINGID=500 Start the <code>ypbind</code> and <code>yppasswd</code> services. Then initialise the NIS database: /usr/lib/yp/ypinit -m Accept defaults at prompts. Now start other NIS services service nis start ==Installing Ganglia Frontend== Install the packages gmetad and ganglia-webfrontend. Configure the front end by appending to <code>/etc/apache2/apache2.conf</code>, the following: Include /etc/ganglia-webfrontend/apache.conf Configure gmetad by adding to the <code>/etc/ganglia/gmetad.conf</code>, the following line: data_source "HCL Cluster" 192.168.20.1 192.168.20.16 data_source "HCL Service" localhost:8648 This means that the gmetad collector connect to hcl01 and hcl16 on the .20 subnet to gather data for the frontend to use. 
After all packages are configured execute: <source lang="text"> service apache2 restart service gmetad restart </source> Pointing your browser to [http://heterogeneous.ucd.ie/ganglia/index.php here] should display the monitoring page for HCL Cluster. <code>gmond</code> must also be installed and configured on the cluster nodes. =Hardware Monitoring & Backup= ==Disk Monitoring== Install smartmontools as per [http://www.howtoforge.com/checking-hard-disk-sanity-with-smartmontools-debian-ubuntu here]. Briefly: apt-get install smartmontools Edit <code>/etc/defaults/smartmontools</code> so that it contains: <source lang="text"> # List of devices you want to explicitly enable S.M.A.R.T. for # Not needed (and not recommended) if the device is monitored by smartd enable_smart="/dev/sda" # uncomment to start smartd on system startup start_smartd=yes # uncomment to pass additional options to smartd on startup smartd_opts="--interval=1800" </source> Open <code>/etc/smartd.conf</code> and edit the first line that begins with DEVICESCAN (all lines after the first instance of DEVICESCAN are ignored). Have it read something like: DEVICESCAN -d removable -n standby -m root -m robert_higgins@iol.ie -M exec /usr/share/smartmontools/smartd-runner Then start the service <code>/etc/init.d/smartmontools start</code> Note, consider installing this on all nodes, as it would be interesting to have prior notice of any failing disks. = Torque - PBS = Download Torque from [http://www.clusterresources.com/pages/products/torque-resource-manager.php here]. Extract archive and configure: ./configure --prefix=/usr/local Run <code>make && make install</code>. Torque will have installed files under <code>/usr/local</code> and <code>/var/spool/torque</code>. 
Edit the Torque config file <code>/var/spool/torque/torque.cfg</code> adding the line: SERVERHOST=192.168.20.254 Add the compute nodes of the cluster to the file <code>/var/spool/torque/server_priv/nodes</code> <source lang="text"> hcl01 hcl02 hcl03 hcl04 hcl05 hcl06 hcl07 hcl08 hcl09 hcl10 hcl11 hcl12 hcl13 hcl14 hcl15 hcl16 </source> By default /usr/local/lib may not be in the list of directories to search for dynamic-linked libraries. If so, add that path to the end of /etc/ld.so.conf.d/local.conf, and run "ldconfig" to update. Now return to the extracted source distribution, find the <code>contrib/init.d</code> folder and copy the <code>debian.pbs_server</code> to <code>/etc/init.d/pbs_server</code>. Edit the inplace file so that the header reads: <source lang="bash"> #!/bin/sh ### BEGIN INIT INFO # Provides: pbs_server # Required-Start: $local_fs $named # Should-Start: # Required-Stop: </source> This will ensure that important services are started before the PBS/Torque daemon. then run update-rc.d pbs_server defaults pbs_server -t create qterm pbs_server ===Packages for nodes== make packages scp torque-package-mom-linux-i686.sh hclxx: scp torque-package-clients-linux-i686.sh hclxx: ===Queues=== We will configure 4 queues, with varying priority and preemption policies. *The first is the <b>normal</b> queue. This will be where most jobs, which should not be interrupted, will be placed. It has the highest priority. *The second queue is <b>lowpri</b>, it is for running jobs which may run for extended periods but are interruptible. Should a job <i>A</i> be placed on the normal queue while a lowpri job <i>B</i> is running, <i>B</i> will be sent a kill signal, so it may shut down cleanly, and then it will be requeued on the system, so it can resume running after <i>A</i>b has finished. *The third queue is for running of service jobs like the homedir backup. It is named <b>service</b>. 
This will have lower priority than the above queues and jobs running on this queue will be preemptable. *The final queue is the <b>volunteer</b> queue. This has the lowest priority and is for executing volunteer computing jobs during otherwise idle periods. These jobs will also be preemptable. ====Queue Setup==== Enter the queue manager program <code>qmgr</code> as the root user on heterogeneous. Send it the following commands: Allow all users to see all queued jobs: set server query_other_jobs=TRUE' Create the default <b>normal</b> queue create queue normal set queue normal queue_type = Execution set queue normal Priority = 10000 set queue normal enabled = True set queue normal started = True Create <b>lowpri</b> queue create queue lowpri set queue lowpri queue_type = Execution set queue lowpri Priority = 1000 set queue lowpri enabled = True set queue lowpri started = True Create <b>service</b> queue create queue service set queue service queue_type = Execution set queue service Priority = 100 set queue service enabled = True set queue service started = True Create <b>volunteer</b> queue create queue volunteer set queue volunteer queue_type = Execution set queue volunteer Priority = 0 set queue volunteer enabled = True set queue volunteer started = True Set some server settings set server default_queue=normal set server resources_default.nodes = 1 set server resources_default.walltime = 12:00:00 = Maui Scheduler = Download from [http://www.clusterresources.com/product/maui/ here]. 
Now edit the maui configuration for these queues: <source lang="text"> SERVERHOST heterogeneous ADMIN1 root RMPOLLINTERVAL 00:00:01 DEFERTIME 0 DEFERCOUNT 86400 PREEMPTIONPOLICY REQUEUE QUEUETIMEWEIGHT 1 QOSWEIGHT 1 SYSCFG QLIST=normal,lowpri,service,volunteer QOSCFG[normal] QFLAGS=PREEMPTOR QOSCFG[lowpri] QFLAGS=PREEMPTEE,PREEMPTOR QOSCFG[service] QFLAGS=PREEMPTEE,PREEMPTOR QOSCFG[volunteer] QFLAGS=PREEMPTEE CREDWEIGHT 1 CLASSWEIGHT 1 CLASSCFG[normal] QDEF=normal PRIORITY=10000 CLASSCFG[lowpri] QDEF=lowpri PRIORITY=1000 CLASSCFG[service] QDEF=service PRIORITY=100 CLASSCFG[volunteer] QDEF=volunteer PRIORITY=0 </source> f838d2d6e44a5e8dff6b1dd9949764b9e39d7755 516 515 2010-10-15T13:59:12Z Davepc 2 /* =Packages for nodes */ wikitext text/x-wiki * Basic installation of Debian Squeeze ==Networking== ===Interfaces=== * edit <code>/etc/networks/interfaces</code> Note that at some point eth1 should be configured by DHCP, it is on the UCD LAN and must be registered correctly (update MAC address with services). <code>eth0</code> is the internal network. <source lang="text"> # The loopback network interface auto lo eth0 eth1 iface lo inet loopback # The primary network interface allow-hotplug eth0 iface eth0 inet static address 192.168.21.254 netmask 255.255.255.0 gateway 192.168.21.1 iface eth1 inet static address 193.1.132.124 netmask 255.255.252.0 gateway 193.1.132.1 </source> * Install non-free linux firmware for network interface (eth0). This will allow Gigabit operation on eth0 with the tg3 hardware (I think). Edit <code>/etc/apt/sources.list</code> including the lines: <source lang="text">deb http://ftp.ie.debian.org/debian/ squeeze main contrib non-free deb-src http://ftp.ie.debian.org/debian/ squeeze main contrib non-free</source> * Install firmware-linux: <source lang="text">apt-get update && apt-get install firmware-linux</source> You probably need to reboot now. ===DNS / BIND=== We will run our own DNS server for the cluster. 
First set resolv.conf: <source lang="text"> nameserver 127.0.0.1 nameserver 137.43.116.19 nameserver 137.43.116.17 nameserver 137.43.105.22 domain ucd.ie search ucd.ie </source> Now install bind9 (<code>apt-get install bind9</code>). Edit <code>/etc/bind/named.conf.local</code> and set the domain zones for the cluster (forwards and reverse). We have two subdomains where reverse lookups will have to be specified 192.168.20 and 192.168.21 <source lang="text"> // // Do any local configuration here // // Consider adding the 1918 zones here, if they are not used in your // organization //include "/etc/bind/zones.rfc1918"; include "/etc/bind/rndc.key"; controls { inet 127.0.0.1 allow { localhost; } keys { "rndc-key"; }; }; zone "heterogeneous.ucd.ie" { type master; file "db.heterogeneous.ucd.ie"; }; zone "21.168.192.in-addr.arpa" { type master; file "db.192.168.21"; }; zone "20.168.192.in-addr.arpa" { type master; file "db.192.168.20"; }; </source> Also edit the options file: <code>/etc/bind/named.conf.options</code>, note the subnet we define in the allow sections, 192.168.20/23, it will permit access from 192.168.20.* and 192.168.21.* addresses. <source lang="text"> options { directory "/var/cache/bind"; // If there is a firewall between you and nameservers you want // to talk to, you may need to fix the firewall to allow multiple // ports to talk. See http://www.kb.cert.org/vuls/id/800113 // If your ISP provided one or more IP addresses for stable // nameservers, you probably want to use them as forwarders. // Uncomment the following block, and insert the addresses replacing // the all-0's placeholder. 
forwarders { 137.43.116.19; 137.43.116.17; 137.43.105.22; }; recursion yes; version "REFUSED"; allow-recursion { 127.0.0.1; 192.168.20.0/23; }; allow-query { 127.0.0.1; 192.168.20.0/23; }; auth-nxdomain no; # conform to RFC1035 listen-on-v6 { any; }; }; </source> Now work on the zone files specified <code>/var/cache/bind/db.heterogneneous.ucd.ie</code> and the reverse maps <code>/var/cache/bind/db.192.168.21</code> & <code>/var/cache/bind/db.192.168.21</code> Populate them with all nodes of the cluster. ===IP Tables=== * Set up <code>iptables</code>. We want to implement NAT between the internal network (<code>eth0</code>) and external one (<code>eth1</code>). Add a script to <code>/etc/network/if-up.d</code> directory, named <code>00iptables</code>. All scripts in this directory will be executed after network interfaces are brought up, so this will persist: <source lang="bash"> #!/bin/sh PATH=/usr/sbin:/sbin:/bin:/usr/bin IF_INT=eth0 IF_EXT=eth1 # # delete all existing rules. # iptables -F iptables -t nat -F iptables -t mangle -F iptables -X # Always accept loopback traffic iptables -A INPUT -i lo -j ACCEPT # Allow established connections, and those not coming from the outside iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT iptables -A INPUT -m state --state NEW ! -i $IF_EXT -j ACCEPT iptables -A FORWARD -i $IF_EXT -o $IF_INT -m state --state ESTABLISHED,RELATED -j ACCEPT # Allow outgoing connections from the LAN side. iptables -A FORWARD -i $IF_INT -o $IF_EXT -j ACCEPT # Masquerade. iptables -t nat -A POSTROUTING -o $IF_EXT -j MASQUERADE # Don't forward from the outside to the inside. iptables -A FORWARD -i $IF_EXT -o $IF_EXT -j REJECT # Enable routing. 
echo 1 > /proc/sys/net/ipv4/ip_forward
</source>

==NFS==
Install:
 apt-get install nfs-kernel-server nfs-common portmap
Add to <code>/etc/exports</code>:
 /home 192.168.20.0/255.255.254.0(rw,sync,no_root_squash,no_subtree_check)
Re-export the shares with:
 exportfs -a

==Clonezilla==
Firstly, Clonezilla is probably going to pollute a lot of your server configuration when it sets itself up. Be prepared to lose your iptables configuration, NFS settings (if any) and DHCP settings. Maybe more.
* Follow the guide to installing Clonezilla [http://www.howtoforge.com/cloning-linux-systems-with-clonezilla-server-edition-clonezilla-se-p2 here]. Essentially:
** add the repository key: <source lang="text">wget -q http://drbl.sourceforge.net/GPG-KEY-DRBL -O- | apt-key add -</source>
** add the following line to /etc/apt/sources.list: <source lang="text">deb http://drbl.sourceforge.net/drbl-core drbl stable</source>
** run: <source lang="text">apt-get update && apt-get install drbl && /opt/drbl/sbin/drbl4imp</source>
** accept the default options to drbl4imp.
* After Clonezilla has installed, edit <code>/etc/dhcp3/dhcpd.conf</code>, adding entries for the test nodes <code>hcl07</code> and <code>hcl03</code>. Also ensure these nodes have been removed from the existing heterogeneous.ucd.ie DHCP configuration so that they are served by only one machine.
<source lang="text">
default-lease-time 300;
max-lease-time 300;
option subnet-mask 255.255.255.0;
option domain-name-servers 137.43.116.19,137.43.116.17,137.43.105.22;
option domain-name "ucd.ie";
ddns-update-style none;
# brett had ad-hoc ...?
server-name drbl;
filename = "pxelinux.0";

subnet 192.168.21.0 netmask 255.255.255.0 {
    option subnet-mask 255.255.255.0;
    option routers 192.168.21.1;
    next-server 192.168.21.254;
    pool {
        # allow members of "DRBL-Client";
        range 192.168.21.200 192.168.21.212;
    }
    host hcl03 {
        option host-name "hcl03.ucd.ie";
        hardware ethernet 00:14:22:0A:22:6C;
        fixed-address 192.168.21.5;
    }
    host hcl03_eth1 {
        option host-name "hcl03_eth1.ucd.ie";
        hardware ethernet 00:14:22:0A:22:6D;
        fixed-address 192.168.21.105;
    }
    host hcl07 {
        option host-name "hcl07.ucd.ie";
        hardware ethernet 00:14:22:0A:20:E2;
        fixed-address 192.168.21.9;
    }
    host hcl07_eth1 {
        option host-name "hcl07_eth1.ucd.ie";
        hardware ethernet 00:14:22:0A:20:E3;
        fixed-address 192.168.21.109;
    }
    default-lease-time 21600;
    max-lease-time 43200;
}
</source>

==NTP Daemon==
Nodes in the cluster should use the Network Time Protocol to set their clocks; heterogeneous.ucd.ie should provide them that service. Install ntpd with the following command:
 apt-get install ntp
Configure the daemon by adding the following line to <code>/etc/ntp.conf</code>:
 restrict 192.168.20.0 mask 255.255.254.0 nomodify notrap

==Install DHCP==
Install the DHCP server package with <code>apt-get install dhcp3-server</code>. When you install Clonezilla it will probably pollute your DHCP server setup, so make ... TODO

==Install NIS==
Copy the user entries from <code>passwd</code>, <code>group</code> and <code>shadow</code> in <code>/etc</code> on <code>hcl01</code>. Install nis.
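When merging hcl01's account files, only regular accounts should be carried over so that system users do not collide with each node's local ones. A minimal sketch, using the same 500 cut-off as the MINUID/MINGID settings below (the fetched-file names are illustrative):

```shell
# Filter that keeps only regular accounts (numeric ID in field 3 >= 500,
# matching MINUID/MINGID below); works for passwd and group files alike.
keep_regular() { awk -F: '$3 >= 500' "$@"; }

# For example, after fetching hcl01's files (names are hypothetical):
#   keep_regular passwd.hcl01 >> /etc/passwd
#   keep_regular group.hcl01  >> /etc/group
```

System accounts (UID below 500) stay node-local; only real user entries are appended.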
Set the domain to heterogeneous.ucd.ie: edit <code>/etc/defaultdomain</code> so that it contains:
 heterogeneous.ucd.ie
Edit <code>/etc/default/nis</code> so that it contains:
 # Are we a NIS server and if so what kind (values: false, slave, master)
 NISSERVER=master
Edit <code>/etc/ypserv.securenets</code> so that it contains:
 # allow connections from local
 255.0.0.0       127.0.0.0
 # allow connections from heterogeneous subnets .20 and .21
 255.255.254.0   192.168.20.0
The NIS host is also a client of itself, so do the client set-up as follows. Edit <code>/etc/hosts</code> and ensure the NIS master is listed:
 192.168.20.254 heterogeneous.ucd.ie heterogeneous
Edit <code>/etc/yp.conf</code> and ensure that it contains:
 domain heterogeneous.ucd.ie server localhost
Edit <code>/etc/passwd</code>, adding a line at the end that reads <code>+::::::</code>. Edit <code>/etc/group</code>, adding a line <code>+:::</code> at the end. The NIS Makefile will not pull user and group IDs below a certain value; set both limits to 500 in <code>/var/yp/Makefile</code>:
 MINUID=500
 MINGID=500
Start the <code>ypbind</code> and <code>yppasswd</code> services. Then initialise the NIS database:
 /usr/lib/yp/ypinit -m
Accept the defaults at the prompts. Now start the other NIS services:
 service nis start

==Installing Ganglia Frontend==
Install the packages gmetad and ganglia-webfrontend. Configure the front end by appending the following to <code>/etc/apache2/apache2.conf</code>:
 Include /etc/ganglia-webfrontend/apache.conf
Configure gmetad by adding the following lines to <code>/etc/ganglia/gmetad.conf</code>:
 data_source "HCL Cluster" 192.168.20.1 192.168.20.16
 data_source "HCL Service" localhost:8648
This means that the gmetad collector connects to hcl01 and hcl16 on the .20 subnet to gather data for the frontend to use.
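On the node side, the essential requirement is that every gmond reports under the same cluster name that the data_source line names; otherwise the frontend shows an empty cluster. A minimal fragment for the nodes' <code>gmond.conf</code>, assuming the stock multicast defaults are otherwise kept (this is a sketch, not the full shipped file):

```text
# The name must match the data_source name in gmetad.conf exactly.
cluster {
  name = "HCL Cluster"
}
```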
After all packages are configured execute:
<source lang="text">
service apache2 restart
service gmetad restart
</source>
Pointing your browser [http://heterogeneous.ucd.ie/ganglia/index.php here] should display the monitoring page for the HCL Cluster. <code>gmond</code> must also be installed and configured on the cluster nodes.

=Hardware Monitoring & Backup=
==Disk Monitoring==
Install smartmontools as per [http://www.howtoforge.com/checking-hard-disk-sanity-with-smartmontools-debian-ubuntu here]. Briefly:
 apt-get install smartmontools
Edit <code>/etc/default/smartmontools</code> so that it contains:
<source lang="text">
# List of devices you want to explicitly enable S.M.A.R.T. for
# Not needed (and not recommended) if the device is monitored by smartd
enable_smart="/dev/sda"
# uncomment to start smartd on system startup
start_smartd=yes
# uncomment to pass additional options to smartd on startup
smartd_opts="--interval=1800"
</source>
Open <code>/etc/smartd.conf</code> and edit the first line that begins with DEVICESCAN (all lines after the first instance of DEVICESCAN are ignored). Have it read something like:
 DEVICESCAN -d removable -n standby -m root -m robert_higgins@iol.ie -M exec /usr/share/smartmontools/smartd-runner
Then start the service with <code>/etc/init.d/smartmontools start</code>. Note: consider installing this on all nodes, as advance warning of any failing disks would be valuable.

= Torque - PBS =
Download Torque from [http://www.clusterresources.com/pages/products/torque-resource-manager.php here]. Extract the archive and configure:
 ./configure --prefix=/usr/local
Run <code>make && make install</code>. Torque will have installed files under <code>/usr/local</code> and <code>/var/spool/torque</code>.
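The server's node list (set up in the next step) is just one hostname per line; with a regular naming scheme like hcl01..hcl16 it can be generated rather than typed. A sketch:

```shell
# Generate the 16 node names, zero-padded (hcl01 .. hcl16), for
# /var/spool/torque/server_priv/nodes instead of writing them by hand.
for i in $(seq -w 1 16); do echo "hcl$i"; done > nodes.txt
# Then install it: mv nodes.txt /var/spool/torque/server_priv/nodes
```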
Edit the Torque config file <code>/var/spool/torque/torque.cfg</code>, adding the line:
 SERVERHOST=192.168.20.254
Add the compute nodes of the cluster to the file <code>/var/spool/torque/server_priv/nodes</code>:
<source lang="text">
hcl01
hcl02
hcl03
hcl04
hcl05
hcl06
hcl07
hcl08
hcl09
hcl10
hcl11
hcl12
hcl13
hcl14
hcl15
hcl16
</source>
By default <code>/usr/local/lib</code> may not be in the list of directories searched for dynamically linked libraries. If so, add that path to the end of <code>/etc/ld.so.conf.d/local.conf</code> and run <code>ldconfig</code> to update the cache.

Now return to the extracted source distribution, find the <code>contrib/init.d</code> folder and copy <code>debian.pbs_server</code> to <code>/etc/init.d/pbs_server</code>. Edit the copied file so that the header reads:
<source lang="bash">
#!/bin/sh
### BEGIN INIT INFO
# Provides:       pbs_server
# Required-Start: $local_fs $named
# Should-Start:
# Required-Stop:
</source>
This ensures that the services Torque depends on are started before the PBS/Torque daemon. Then run:
 update-rc.d pbs_server defaults
 pbs_server -t create
 qterm
 pbs_server

===Packages for nodes===
 make packages
 scp torque-package-mom-linux-i686.sh hclxx:
 scp torque-package-clients-linux-i686.sh hclxx:

===Queues===
We will configure four queues, with varying priority and preemption policies.
*The first is the <b>normal</b> queue. Most jobs, which should not be interrupted, will be placed here. It has the highest priority.
*The second queue is <b>lowpri</b>. It is for jobs which may run for extended periods but are interruptible. Should a job <i>A</i> be placed on the normal queue while a lowpri job <i>B</i> is running, <i>B</i> is sent a kill signal so that it may shut down cleanly, and is then requeued on the system so it can resume after <i>A</i> has finished.
*The third queue, named <b>service</b>, is for running service jobs such as the homedir backup. It has lower priority than the above queues and jobs running on it are preemptable.
*The final queue is the <b>volunteer</b> queue. This has the lowest priority and is for executing volunteer computing jobs during otherwise idle periods. These jobs are also preemptable.

====Queue Setup====
Enter the queue manager program <code>qmgr</code> as the root user on heterogeneous and send it the following commands.

Allow all users to see all queued jobs:
 set server query_other_jobs = True
Create the default <b>normal</b> queue:
 create queue normal
 set queue normal queue_type = Execution
 set queue normal Priority = 10000
 set queue normal enabled = True
 set queue normal started = True
Create the <b>lowpri</b> queue:
 create queue lowpri
 set queue lowpri queue_type = Execution
 set queue lowpri Priority = 1000
 set queue lowpri enabled = True
 set queue lowpri started = True
Create the <b>service</b> queue:
 create queue service
 set queue service queue_type = Execution
 set queue service Priority = 100
 set queue service enabled = True
 set queue service started = True
Create the <b>volunteer</b> queue:
 create queue volunteer
 set queue volunteer queue_type = Execution
 set queue volunteer Priority = 0
 set queue volunteer enabled = True
 set queue volunteer started = True
Set some server defaults:
 set server default_queue = normal
 set server resources_default.nodes = 1
 set server resources_default.walltime = 12:00:00

= Maui Scheduler =
Download from [http://www.clusterresources.com/product/maui/ here].
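With the queues above defined, users direct a job to a queue either with <code>qsub -q</code> or with a PBS directive inside the job script. A hypothetical example (the experiment command is illustrative):

```shell
# Create a job script that targets the preemptible lowpri queue.
cat > job.sh <<'EOF'
#!/bin/sh
#PBS -q lowpri
#PBS -l nodes=1,walltime=02:00:00
cd "$PBS_O_WORKDIR"
./long_running_experiment
EOF
```

Submit it with <code>qsub job.sh</code>; a job arriving on the normal queue can then preempt it under the policies configured here.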
Now edit the Maui configuration for these queues:
<source lang="text">
SERVERHOST            heterogeneous
ADMIN1                root
RMPOLLINTERVAL        00:00:01
DEFERTIME             0
DEFERCOUNT            86400
PREEMPTIONPOLICY      REQUEUE
QUEUETIMEWEIGHT       1
QOSWEIGHT             1
SYSCFG                QLIST=normal,lowpri,service,volunteer
QOSCFG[normal]        QFLAGS=PREEMPTOR
QOSCFG[lowpri]        QFLAGS=PREEMPTEE,PREEMPTOR
QOSCFG[service]       QFLAGS=PREEMPTEE,PREEMPTOR
QOSCFG[volunteer]     QFLAGS=PREEMPTEE
CREDWEIGHT            1
CLASSWEIGHT           1
CLASSCFG[normal]      QDEF=normal PRIORITY=10000
CLASSCFG[lowpri]      QDEF=lowpri PRIORITY=1000
CLASSCFG[service]     QDEF=service PRIORITY=100
CLASSCFG[volunteer]   QDEF=volunteer PRIORITY=0
</source>
See also [[HCL_cluster/hcl_node_install_configuration_log]].
After all packages are configured execute: <source lang="text"> service apache2 restart service gmetad restart </source> Pointing your browser to [http://heterogeneous.ucd.ie/ganglia/index.php here] should display the monitoring page for HCL Cluster. <code>gmond</code> must also be installed and configured on the cluster nodes. =Hardware Monitoring & Backup= ==Disk Monitoring== Install smartmontools as per [http://www.howtoforge.com/checking-hard-disk-sanity-with-smartmontools-debian-ubuntu here]. Briefly: apt-get install smartmontools Edit <code>/etc/defaults/smartmontools</code> so that it contains: <source lang="text"> # List of devices you want to explicitly enable S.M.A.R.T. for # Not needed (and not recommended) if the device is monitored by smartd enable_smart="/dev/sda" # uncomment to start smartd on system startup start_smartd=yes # uncomment to pass additional options to smartd on startup smartd_opts="--interval=1800" </source> Open <code>/etc/smartd.conf</code> and edit the first line that begins with DEVICESCAN (all lines after the first instance of DEVICESCAN are ignored). Have it read something like: DEVICESCAN -d removable -n standby -m root -m robert_higgins@iol.ie -M exec /usr/share/smartmontools/smartd-runner Then start the service <code>/etc/init.d/smartmontools start</code> Note, consider installing this on all nodes, as it would be interesting to have prior notice of any failing disks. = Torque - PBS = Download Torque from [http://www.clusterresources.com/pages/products/torque-resource-manager.php here]. Extract archive and configure: ./configure --prefix=/usr/local Run <code>make && make install</code>. Torque will have installed files under <code>/usr/local</code> and <code>/var/spool/torque</code>. 
Edit the Torque config file <code>/var/spool/torque/torque.cfg</code> adding the line: SERVERHOST=192.168.20.254 Add the compute nodes of the cluster to the file <code>/var/spool/torque/server_priv/nodes</code> <source lang="text"> hcl01 hcl02 hcl03 hcl04 hcl05 hcl06 hcl07 hcl08 hcl09 hcl10 hcl11 hcl12 hcl13 hcl14 hcl15 hcl16 </source> By default /usr/local/lib may not be in the list of directories to search for dynamic-linked libraries. If so, add that path to the end of /etc/ld.so.conf.d/local.conf, and run "ldconfig" to update. Now return to the extracted source distribution, find the <code>contrib/init.d</code> folder and copy the <code>debian.pbs_server</code> to <code>/etc/init.d/pbs_server</code>. Edit the inplace file so that the header reads: <source lang="bash"> #!/bin/sh ### BEGIN INIT INFO # Provides: pbs_server # Required-Start: $local_fs $named # Should-Start: # Required-Stop: </source> This will ensure that important services are started before the PBS/Torque daemon. then run update-rc.d pbs_server defaults pbs_server -t create qterm pbs_server ===Packages for nodes=== make packages scp torque-package-mom-linux-i686.sh hclxx: scp torque-package-clients-linux-i686.sh hclxx: ===Queues=== We will configure 4 queues, with varying priority and preemption policies. *The first is the <b>normal</b> queue. This will be where most jobs, which should not be interrupted, will be placed. It has the highest priority. *The second queue is <b>lowpri</b>, it is for running jobs which may run for extended periods but are interruptible. Should a job <i>A</i> be placed on the normal queue while a lowpri job <i>B</i> is running, <i>B</i> will be sent a kill signal, so it may shut down cleanly, and then it will be requeued on the system, so it can resume running after <i>A</i>b has finished. *The third queue is for running of service jobs like the homedir backup. It is named <b>service</b>. 
This will have lower priority than the above queues and jobs running on this queue will be preemptable. *The final queue is the <b>volunteer</b> queue. This has the lowest priority and is for executing volunteer computing jobs during otherwise idle periods. These jobs will also be preemptable. ====Queue Setup==== Enter the queue manager program <code>qmgr</code> as the root user on heterogeneous. Send it the following commands: Allow all users to see all queued jobs: set server query_other_jobs=TRUE' Create the default <b>normal</b> queue create queue normal set queue normal queue_type = Execution set queue normal Priority = 10000 set queue normal enabled = True set queue normal started = True Create <b>lowpri</b> queue create queue lowpri set queue lowpri queue_type = Execution set queue lowpri Priority = 1000 set queue lowpri enabled = True set queue lowpri started = True Create <b>service</b> queue create queue service set queue service queue_type = Execution set queue service Priority = 100 set queue service enabled = True set queue service started = True Create <b>volunteer</b> queue create queue volunteer set queue volunteer queue_type = Execution set queue volunteer Priority = 0 set queue volunteer enabled = True set queue volunteer started = True Set some server settings set server default_queue=normal set server resources_default.nodes = 1 set server resources_default.walltime = 12:00:00 = Maui Scheduler = Download from [http://www.clusterresources.com/product/maui/ here]. 
Now edit the maui configuration for these queues: <source lang="text"> SERVERHOST heterogeneous ADMIN1 root RMPOLLINTERVAL 00:00:01 DEFERTIME 0 DEFERCOUNT 86400 PREEMPTIONPOLICY REQUEUE QUEUETIMEWEIGHT 1 QOSWEIGHT 1 SYSCFG QLIST=normal,lowpri,service,volunteer QOSCFG[normal] QFLAGS=PREEMPTOR QOSCFG[lowpri] QFLAGS=PREEMPTEE,PREEMPTOR QOSCFG[service] QFLAGS=PREEMPTEE,PREEMPTOR QOSCFG[volunteer] QFLAGS=PREEMPTEE CREDWEIGHT 1 CLASSWEIGHT 1 CLASSCFG[normal] QDEF=normal PRIORITY=10000 CLASSCFG[lowpri] QDEF=lowpri PRIORITY=1000 CLASSCFG[service] QDEF=service PRIORITY=100 CLASSCFG[volunteer] QDEF=volunteer PRIORITY=0 </source> 9feb04f16e5b26b5d8e8088a26984c3514605363 529 528 2010-10-18T13:08:52Z Davepc 2 /* Interfaces */ wikitext text/x-wiki * Basic installation of Debian Squeeze See also [[HCL_cluster/hcl_node_install_configuration_log]] ==Networking== ===Interfaces=== * edit <code>/etc/networks/interfaces</code> Note that at some point eth1 should be configured by DHCP, it is on the UCD LAN and must be registered correctly (update MAC address with services). <code>eth0</code> is the internal network. <source lang="text"> auto lo eth0 eth1 eth2 iface lo inet loopback iface eth0 inet static address 192.168.20.254 netmask 255.255.255.0 iface eth1 inet static address 192.168.21.254 netmask 255.255.255.0 iface eth2 inet dhcp </source> * Install non-free linux firmware for network interface (eth0). This will allow Gigabit operation on eth0 with the tg3 hardware (I think). Edit <code>/etc/apt/sources.list</code> including the lines: <source lang="text">deb http://ftp.ie.debian.org/debian/ squeeze main contrib non-free deb-src http://ftp.ie.debian.org/debian/ squeeze main contrib non-free</source> * Install firmware-linux: <source lang="text">apt-get update && apt-get install firmware-linux</source> You probably need to reboot now. ===DNS / BIND=== We will run our own DNS server for the cluster. 
First set resolv.conf: <source lang="text"> nameserver 127.0.0.1 nameserver 137.43.116.19 nameserver 137.43.116.17 nameserver 137.43.105.22 domain ucd.ie search ucd.ie </source> Now install bind9 (<code>apt-get install bind9</code>). Edit <code>/etc/bind/named.conf.local</code> and set the domain zones for the cluster (forwards and reverse). We have two subdomains where reverse lookups will have to be specified 192.168.20 and 192.168.21 <source lang="text"> // // Do any local configuration here // // Consider adding the 1918 zones here, if they are not used in your // organization //include "/etc/bind/zones.rfc1918"; include "/etc/bind/rndc.key"; controls { inet 127.0.0.1 allow { localhost; } keys { "rndc-key"; }; }; zone "heterogeneous.ucd.ie" { type master; file "db.heterogeneous.ucd.ie"; }; zone "21.168.192.in-addr.arpa" { type master; file "db.192.168.21"; }; zone "20.168.192.in-addr.arpa" { type master; file "db.192.168.20"; }; </source> Also edit the options file: <code>/etc/bind/named.conf.options</code>, note the subnet we define in the allow sections, 192.168.20/23, it will permit access from 192.168.20.* and 192.168.21.* addresses. <source lang="text"> options { directory "/var/cache/bind"; // If there is a firewall between you and nameservers you want // to talk to, you may need to fix the firewall to allow multiple // ports to talk. See http://www.kb.cert.org/vuls/id/800113 // If your ISP provided one or more IP addresses for stable // nameservers, you probably want to use them as forwarders. // Uncomment the following block, and insert the addresses replacing // the all-0's placeholder. 
forwarders { 137.43.116.19; 137.43.116.17; 137.43.105.22; }; recursion yes; version "REFUSED"; allow-recursion { 127.0.0.1; 192.168.20.0/23; }; allow-query { 127.0.0.1; 192.168.20.0/23; }; auth-nxdomain no; # conform to RFC1035 listen-on-v6 { any; }; }; </source> Now work on the zone files specified <code>/var/cache/bind/db.heterogneneous.ucd.ie</code> and the reverse maps <code>/var/cache/bind/db.192.168.21</code> & <code>/var/cache/bind/db.192.168.21</code> Populate them with all nodes of the cluster. ===IP Tables=== * Set up <code>iptables</code>. We want to implement NAT between the internal network (<code>eth0</code>) and external one (<code>eth1</code>). Add a script to <code>/etc/network/if-up.d</code> directory, named <code>00iptables</code>. All scripts in this directory will be executed after network interfaces are brought up, so this will persist: <source lang="bash"> #!/bin/sh PATH=/usr/sbin:/sbin:/bin:/usr/bin IF_INT=eth0 IF_EXT=eth1 # # delete all existing rules. # iptables -F iptables -t nat -F iptables -t mangle -F iptables -X # Always accept loopback traffic iptables -A INPUT -i lo -j ACCEPT # Allow established connections, and those not coming from the outside iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT iptables -A INPUT -m state --state NEW ! -i $IF_EXT -j ACCEPT iptables -A FORWARD -i $IF_EXT -o $IF_INT -m state --state ESTABLISHED,RELATED -j ACCEPT # Allow outgoing connections from the LAN side. iptables -A FORWARD -i $IF_INT -o $IF_EXT -j ACCEPT # Masquerade. iptables -t nat -A POSTROUTING -o $IF_EXT -j MASQUERADE # Don't forward from the outside to the inside. iptables -A FORWARD -i $IF_EXT -o $IF_EXT -j REJECT # Enable routing. 
echo 1 > /proc/sys/net/ipv4/ip_forward </source> ==NFS== Install: apt-get install nfs-kernel-server nfs-common portmap Add to /etc/exports /home 192.168.20.0/255.255.254.0(rw,sync,no_root_squash,no_subtree_check) Restart nfs server with exportfs -a ==Clonezilla== Firstly, Clonezilla is probably going to pollute a lot of your server configuration when it sets itself up. Be prepared to loose your IPtables configuration, NFS (if any) and DHCP settings. Maybe more. * follow the guide to installing Clonezilla [http://www.howtoforge.com/cloning-linux-systems-with-clonezilla-server-edition-clonezilla-se-p2 here]. Essentially: ** add repository key <source lang="text">wget -q http://drbl.sourceforge.net/GPG-KEY-DRBL -O- | apt-key add -</source> ** the line add to /etc/apt/sources.list: <source lang="text">deb http://drbl.sourceforge.net/drbl-core drbl stable</source> ** run: <source lang="text">apt-get update && apt-get install drbl && /opt/drbl/sbin/drbl4imp</source> ** accept default options to drbl4imp. * After Clonezilla has installed edit <code>/etc/dhcpd3/dhcpd.conf</code>, adding all entries for test nodes <code>hcl07</code> and <code>hcl03</code>. Also ensure these nodes have been removed from the inplace heterogeneous.ucd.ie server so that they are only served by one machine. <source lang="text"> default-lease-time 300; max-lease-time 300; option subnet-mask 255.255.255.0; option domain-name-servers 137.43.116.19,137.43.116.17,137.43.105.22; option domain-name "ucd.ie"; ddns-update-style none; # brett had ad-hoc ...? 
server-name drbl; filename = "pxelinux.0"; subnet 192.168.21.0 netmask 255.255.255.0 { option subnet-mask 255.255.255.0; option routers 192.168.21.1; next-server 192.168.21.254; pool { # allow members of "DRBL-Client"; range 192.168.21.200 192.168.21.212; } host hcl03 { option host-name "hcl03.ucd.ie"; hardware ethernet 00:14:22:0A:22:6C; fixed-address 192.168.21.5; } host hcl03_eth1 { option host-name "hcl03_eth1.ucd.ie"; hardware ethernet 00:14:22:0A:22:6D; fixed-address 192.168.21.105; } host hcl07 { option host-name "hcl07.ucd.ie"; hardware ethernet 00:14:22:0A:20:E2; fixed-address 192.168.21.9; } host hcl07_eth1 { option host-name "hcl07_eth1.ucd.ie"; hardware ethernet 00:14:22:0A:20:E3; fixed-address 192.168.21.109; } default-lease-time 21600; max-lease-time 43200; } </source> ==NTP Daemon== Nodes in the cluster should use the Network Time Protocol to set their clocks, heterogeneous.ucd.ie should provide them that service. Install ntpd with the following command: apt-get install ntp Configure the daemon with the following line in <code>/etc/ntp.conf</code>: restrict 192.168.20.0 mask 255.255.254.0 nomodify notrap ==Install DHCP== Install the DHCP server package with <code>apt-get install dhcp3-server</code>. When you install Clonezilla it will probably pollute your DHCP server setup, so make ... TODO ==Install NIS== Copy users from <code>passwd</code>, <code>groups</code> and <code>shadow</code> from <code>/etc</code> on <code>hcl01</code>. Install nis. 
Set domain to heterogeneous.ucd.ie Edit <code>/etc/defaultdomain</code> so that it contains: heterogeneous.ucd.ie Edit <code>/etc/default/nis</code> so that it contains: # Are we a NIS server and if so what kind (values: false, slave, master) NISSERVER=master Edit <code>/etc/ypserv.securenets</code> so that is contains: # allow connects from local 255.0.0.0 127.0.0.0 # allow connections from heterogeneous subnets .20 and .21 255.255.254.0 192.168.20.0 The NIS host is also a client of itself, so do the client set up as follows: Edit <code>/etc/hosts</code> end ensure the NIS Master is listed 192.168.20.254 heterogeneous.ucd.ie heterogeneous Edit <code>/etc/yp.conf</code> and ensure that it contains: domain heterogeneous.ucd.ie server localhost Edit <code>/etc/passwd</code> adding a line to the end that reads: <code>+::::::</code>. Edit <code>/etc/group</code> with a line <code>+:::</code> at the line. The NIS Makefile will not pull userid and groupids that are lower than a certain value, we must set this to 500 in <code>/var/yp/Makefile</code> MINUID=500 MINGID=500 Start the <code>ypbind</code> and <code>yppasswd</code> services. Then initialise the NIS database: /usr/lib/yp/ypinit -m Accept defaults at prompts. Now start other NIS services service nis start ==Installing Ganglia Frontend== Install the packages gmetad and ganglia-webfrontend. Configure the front end by appending to <code>/etc/apache2/apache2.conf</code>, the following: Include /etc/ganglia-webfrontend/apache.conf Configure gmetad by adding to the <code>/etc/ganglia/gmetad.conf</code>, the following line: data_source "HCL Cluster" 192.168.20.1 192.168.20.16 data_source "HCL Service" localhost:8648 This means that the gmetad collector connect to hcl01 and hcl16 on the .20 subnet to gather data for the frontend to use. 
After all packages are configured execute: <source lang="text"> service apache2 restart service gmetad restart </source> Pointing your browser to [http://heterogeneous.ucd.ie/ganglia/index.php here] should display the monitoring page for HCL Cluster. <code>gmond</code> must also be installed and configured on the cluster nodes. =Hardware Monitoring & Backup= ==Disk Monitoring== Install smartmontools as per [http://www.howtoforge.com/checking-hard-disk-sanity-with-smartmontools-debian-ubuntu here]. Briefly: apt-get install smartmontools Edit <code>/etc/defaults/smartmontools</code> so that it contains: <source lang="text"> # List of devices you want to explicitly enable S.M.A.R.T. for # Not needed (and not recommended) if the device is monitored by smartd enable_smart="/dev/sda" # uncomment to start smartd on system startup start_smartd=yes # uncomment to pass additional options to smartd on startup smartd_opts="--interval=1800" </source> Open <code>/etc/smartd.conf</code> and edit the first line that begins with DEVICESCAN (all lines after the first instance of DEVICESCAN are ignored). Have it read something like: DEVICESCAN -d removable -n standby -m root -m robert_higgins@iol.ie -M exec /usr/share/smartmontools/smartd-runner Then start the service <code>/etc/init.d/smartmontools start</code> Note, consider installing this on all nodes, as it would be interesting to have prior notice of any failing disks. = Torque - PBS = Download Torque from [http://www.clusterresources.com/pages/products/torque-resource-manager.php here]. Extract archive and configure: ./configure --prefix=/usr/local Run <code>make && make install</code>. Torque will have installed files under <code>/usr/local</code> and <code>/var/spool/torque</code>. 
Edit the Torque config file <code>/var/spool/torque/torque.cfg</code> adding the line: SERVERHOST=192.168.20.254 Add the compute nodes of the cluster to the file <code>/var/spool/torque/server_priv/nodes</code> <source lang="text"> hcl01 hcl02 hcl03 hcl04 hcl05 hcl06 hcl07 hcl08 hcl09 hcl10 hcl11 hcl12 hcl13 hcl14 hcl15 hcl16 </source> By default /usr/local/lib may not be in the list of directories to search for dynamic-linked libraries. If so, add that path to the end of /etc/ld.so.conf.d/local.conf, and run "ldconfig" to update. Now return to the extracted source distribution, find the <code>contrib/init.d</code> folder and copy the <code>debian.pbs_server</code> to <code>/etc/init.d/pbs_server</code>. Edit the inplace file so that the header reads: <source lang="bash"> #!/bin/sh ### BEGIN INIT INFO # Provides: pbs_server # Required-Start: $local_fs $named # Should-Start: # Required-Stop: </source> This will ensure that important services are started before the PBS/Torque daemon. then run update-rc.d pbs_server defaults pbs_server -t create qterm pbs_server ===Packages for nodes=== make packages scp torque-package-mom-linux-i686.sh hclxx: scp torque-package-clients-linux-i686.sh hclxx: ===Queues=== We will configure 4 queues, with varying priority and preemption policies. *The first is the <b>normal</b> queue. This will be where most jobs, which should not be interrupted, will be placed. It has the highest priority. *The second queue is <b>lowpri</b>, it is for running jobs which may run for extended periods but are interruptible. Should a job <i>A</i> be placed on the normal queue while a lowpri job <i>B</i> is running, <i>B</i> will be sent a kill signal, so it may shut down cleanly, and then it will be requeued on the system, so it can resume running after <i>A</i>b has finished. *The third queue is for running of service jobs like the homedir backup. It is named <b>service</b>. 
This will have lower priority than the above queues and jobs running on this queue will be preemptable. *The final queue is the <b>volunteer</b> queue. This has the lowest priority and is for executing volunteer computing jobs during otherwise idle periods. These jobs will also be preemptable. ====Queue Setup==== Enter the queue manager program <code>qmgr</code> as the root user on heterogeneous. Send it the following commands: Allow all users to see all queued jobs: set server query_other_jobs=TRUE' Create the default <b>normal</b> queue create queue normal set queue normal queue_type = Execution set queue normal Priority = 10000 set queue normal enabled = True set queue normal started = True Create <b>lowpri</b> queue create queue lowpri set queue lowpri queue_type = Execution set queue lowpri Priority = 1000 set queue lowpri enabled = True set queue lowpri started = True Create <b>service</b> queue create queue service set queue service queue_type = Execution set queue service Priority = 100 set queue service enabled = True set queue service started = True Create <b>volunteer</b> queue create queue volunteer set queue volunteer queue_type = Execution set queue volunteer Priority = 0 set queue volunteer enabled = True set queue volunteer started = True Set some server settings set server default_queue=normal set server resources_default.nodes = 1 set server resources_default.walltime = 12:00:00 = Maui Scheduler = Download from [http://www.clusterresources.com/product/maui/ here]. 
Now edit the maui configuration for these queues: <source lang="text"> SERVERHOST heterogeneous ADMIN1 root RMPOLLINTERVAL 00:00:01 DEFERTIME 0 DEFERCOUNT 86400 PREEMPTIONPOLICY REQUEUE QUEUETIMEWEIGHT 1 QOSWEIGHT 1 SYSCFG QLIST=normal,lowpri,service,volunteer QOSCFG[normal] QFLAGS=PREEMPTOR QOSCFG[lowpri] QFLAGS=PREEMPTEE,PREEMPTOR QOSCFG[service] QFLAGS=PREEMPTEE,PREEMPTOR QOSCFG[volunteer] QFLAGS=PREEMPTEE CREDWEIGHT 1 CLASSWEIGHT 1 CLASSCFG[normal] QDEF=normal PRIORITY=10000 CLASSCFG[lowpri] QDEF=lowpri PRIORITY=1000 CLASSCFG[service] QDEF=service PRIORITY=100 CLASSCFG[volunteer] QDEF=volunteer PRIORITY=0 </source> 08d04b528c5b6b98ba749da42f7efbd179df0549 530 529 2010-10-18T13:15:30Z Davepc 2 /* DNS / BIND */ wikitext text/x-wiki * Basic installation of Debian Squeeze See also [[HCL_cluster/hcl_node_install_configuration_log]] ==Networking== ===Interfaces=== * edit <code>/etc/networks/interfaces</code> Note that at some point eth1 should be configured by DHCP, it is on the UCD LAN and must be registered correctly (update MAC address with services). <code>eth0</code> is the internal network. <source lang="text"> auto lo eth0 eth1 eth2 iface lo inet loopback iface eth0 inet static address 192.168.20.254 netmask 255.255.255.0 iface eth1 inet static address 192.168.21.254 netmask 255.255.255.0 iface eth2 inet dhcp </source> * Install non-free linux firmware for network interface (eth0). This will allow Gigabit operation on eth0 with the tg3 hardware (I think). Edit <code>/etc/apt/sources.list</code> including the lines: <source lang="text">deb http://ftp.ie.debian.org/debian/ squeeze main contrib non-free deb-src http://ftp.ie.debian.org/debian/ squeeze main contrib non-free</source> * Install firmware-linux: <source lang="text">apt-get update && apt-get install firmware-linux</source> You probably need to reboot now. ===DNS / BIND=== We will run our own DNS server for the cluster. Because dhclient is running on eth2, we need to stop it breaking resolv.conf. 
Add the following to <code>/etc/dhcp/dhclient.conf</code>: <source lang="text"> supersede domain-name "heterogeneous.ucd.ie ucd.ie"; prepend domain-name-servers 127.0.0.1; </source> After running dhclient, resolv.conf should read: <source lang="text"> domain heterogeneous.ucd.ie search heterogeneous.ucd.ie ucd.ie nameserver 127.0.0.1 nameserver 137.43.116.19 nameserver 137.43.116.17 nameserver 137.43.105.22 </source> Now install bind9 (<code>apt-get install bind9</code>). Edit <code>/etc/bind/named.conf.local</code> and set the domain zones for the cluster (forward and reverse). There are two subnets for which reverse lookups must be specified: 192.168.20 and 192.168.21. <source lang="text"> // // Do any local configuration here // // Consider adding the 1918 zones here, if they are not used in your // organization //include "/etc/bind/zones.rfc1918"; include "/etc/bind/rndc.key"; controls { inet 127.0.0.1 allow { localhost; } keys { "rndc-key"; }; }; zone "heterogeneous.ucd.ie" { type master; file "db.heterogeneous.ucd.ie"; }; zone "21.168.192.in-addr.arpa" { type master; file "db.192.168.21"; }; zone "20.168.192.in-addr.arpa" { type master; file "db.192.168.20"; }; </source> Also edit the options file, <code>/etc/bind/named.conf.options</code>. Note the subnet defined in the allow sections, 192.168.20.0/23; it permits access from both 192.168.20.* and 192.168.21.* addresses. <source lang="text"> options { directory "/var/cache/bind"; // If there is a firewall between you and nameservers you want // to talk to, you may need to fix the firewall to allow multiple // ports to talk. See http://www.kb.cert.org/vuls/id/800113 // If your ISP provided one or more IP addresses for stable // nameservers, you probably want to use them as forwarders. // Uncomment the following block, and insert the addresses replacing // the all-0's placeholder. 
forwarders { 137.43.116.19; 137.43.116.17; 137.43.105.22; }; recursion yes; version "REFUSED"; allow-recursion { 127.0.0.1; 192.168.20.0/23; }; allow-query { 127.0.0.1; 192.168.20.0/23; }; auth-nxdomain no; # conform to RFC1035 listen-on-v6 { any; }; }; </source> Now work on the zone files specified above, <code>/var/cache/bind/db.heterogeneous.ucd.ie</code>, and the reverse maps <code>/var/cache/bind/db.192.168.20</code> and <code>/var/cache/bind/db.192.168.21</code>. Populate them with all nodes of the cluster. ===IP Tables=== * Set up <code>iptables</code>. We want to implement NAT between the internal network (<code>eth0</code>) and the external one (<code>eth1</code>). Add a script named <code>00iptables</code> to the <code>/etc/network/if-up.d</code> directory. All scripts in this directory are executed after network interfaces are brought up, so the rules will persist across reboots: <source lang="bash"> #!/bin/sh PATH=/usr/sbin:/sbin:/bin:/usr/bin IF_INT=eth0 IF_EXT=eth1 # # delete all existing rules. # iptables -F iptables -t nat -F iptables -t mangle -F iptables -X # Always accept loopback traffic iptables -A INPUT -i lo -j ACCEPT # Allow established connections, and those not coming from the outside iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT iptables -A INPUT -m state --state NEW ! -i $IF_EXT -j ACCEPT iptables -A FORWARD -i $IF_EXT -o $IF_INT -m state --state ESTABLISHED,RELATED -j ACCEPT # Allow outgoing connections from the LAN side. iptables -A FORWARD -i $IF_INT -o $IF_EXT -j ACCEPT # Masquerade. iptables -t nat -A POSTROUTING -o $IF_EXT -j MASQUERADE # Don't forward from the outside to the inside. iptables -A FORWARD -i $IF_EXT -o $IF_EXT -j REJECT # Enable routing. 
echo 1 > /proc/sys/net/ipv4/ip_forward </source> ==NFS== Install: apt-get install nfs-kernel-server nfs-common portmap Add to /etc/exports: /home 192.168.20.0/255.255.254.0(rw,sync,no_root_squash,no_subtree_check) Export the share with: exportfs -a ==Clonezilla== Firstly, Clonezilla is probably going to pollute a lot of your server configuration when it sets itself up. Be prepared to lose your iptables configuration, NFS settings (if any) and DHCP settings. Maybe more. * Follow the guide to installing Clonezilla [http://www.howtoforge.com/cloning-linux-systems-with-clonezilla-server-edition-clonezilla-se-p2 here]. Essentially: ** add the repository key: <source lang="text">wget -q http://drbl.sourceforge.net/GPG-KEY-DRBL -O- | apt-key add -</source> ** add the following line to /etc/apt/sources.list: <source lang="text">deb http://drbl.sourceforge.net/drbl-core drbl stable</source> ** run: <source lang="text">apt-get update && apt-get install drbl && /opt/drbl/sbin/drbl4imp</source> ** accept the default options offered by drbl4imp. * After Clonezilla has installed, edit <code>/etc/dhcp3/dhcpd.conf</code>, adding entries for the test nodes <code>hcl07</code> and <code>hcl03</code>. Also ensure these nodes have been removed from the existing DHCP configuration on heterogeneous.ucd.ie so that they are only served by one machine. <source lang="text"> default-lease-time 300; max-lease-time 300; option subnet-mask 255.255.255.0; option domain-name-servers 137.43.116.19,137.43.116.17,137.43.105.22; option domain-name "ucd.ie"; ddns-update-style none; # brett had ad-hoc ...? 
server-name drbl; filename "pxelinux.0"; subnet 192.168.21.0 netmask 255.255.255.0 { option subnet-mask 255.255.255.0; option routers 192.168.21.1; next-server 192.168.21.254; pool { # allow members of "DRBL-Client"; range 192.168.21.200 192.168.21.212; } host hcl03 { option host-name "hcl03.ucd.ie"; hardware ethernet 00:14:22:0A:22:6C; fixed-address 192.168.21.5; } host hcl03_eth1 { option host-name "hcl03_eth1.ucd.ie"; hardware ethernet 00:14:22:0A:22:6D; fixed-address 192.168.21.105; } host hcl07 { option host-name "hcl07.ucd.ie"; hardware ethernet 00:14:22:0A:20:E2; fixed-address 192.168.21.9; } host hcl07_eth1 { option host-name "hcl07_eth1.ucd.ie"; hardware ethernet 00:14:22:0A:20:E3; fixed-address 192.168.21.109; } default-lease-time 21600; max-lease-time 43200; } </source> ==NTP Daemon== Nodes in the cluster should use the Network Time Protocol to set their clocks, and heterogeneous.ucd.ie should provide that service to them. Install ntpd with the following command: apt-get install ntp Configure the daemon with the following line in <code>/etc/ntp.conf</code>: restrict 192.168.20.0 mask 255.255.254.0 nomodify notrap ==Install DHCP== Install the DHCP server package with <code>apt-get install dhcp3-server</code>. When you install Clonezilla it will probably pollute your DHCP server setup, so make ... TODO ==Install NIS== Copy the user entries from the <code>passwd</code>, <code>group</code> and <code>shadow</code> files in <code>/etc</code> on <code>hcl01</code>. Install nis. 
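Copying those account entries over can be scripted. A minimal sketch, assuming the UID 500 cutoff that the NIS Makefile will use; the function name and file paths are illustrative, not part of the wiki setup:

```shell
#!/bin/sh
# Append regular user entries (UID >= 500) from a passwd file copied off
# hcl01 to a local passwd file.  System accounts (UID < 500) are skipped.
merge_users() {
    src=$1
    dst=$2
    # passwd fields are colon-separated; field 3 is the numeric UID
    awk -F: '$3 >= 500' "$src" >> "$dst"
}
# Example, after fetching the file (hypothetical paths):
#   scp hcl01:/etc/passwd /tmp/passwd.hcl01
#   merge_users /tmp/passwd.hcl01 /etc/passwd
```

The same filter works for <code>/etc/group</code> (field 3 is the GID there); <code>/etc/shadow</code> has no numeric ID field, so select its lines by the user names that survived the UID filter.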
Set the domain to heterogeneous.ucd.ie. Edit <code>/etc/defaultdomain</code> so that it contains:
 heterogeneous.ucd.ie
Edit <code>/etc/default/nis</code> so that it contains:
 # Are we a NIS server and if so what kind (values: false, slave, master)
 NISSERVER=master
Edit <code>/etc/ypserv.securenets</code> so that it contains:
 # allow connections from local
 255.0.0.0       127.0.0.0
 # allow connections from heterogeneous subnets .20 and .21
 255.255.254.0   192.168.20.0
The NIS host is also a client of itself, so do the client set-up as follows. Edit <code>/etc/hosts</code> and ensure the NIS master is listed:
 192.168.20.254  heterogeneous.ucd.ie  heterogeneous
Edit <code>/etc/yp.conf</code> and ensure that it contains:
 domain heterogeneous.ucd.ie server localhost
Edit <code>/etc/passwd</code>, adding a line to the end that reads <code>+::::::</code>. Edit <code>/etc/group</code>, adding a line <code>+:::</code> at the end. The NIS Makefile will not pull user ids and group ids that are lower than a certain value; we must set this to 500 in <code>/var/yp/Makefile</code>:
 MINUID=500
 MINGID=500
Start the <code>ypbind</code> and <code>yppasswd</code> services. Then initialise the NIS database:
 /usr/lib/yp/ypinit -m
Accept the defaults at the prompts. Now start the other NIS services:
 service nis start

==Installing Ganglia Frontend==
Install the packages gmetad and ganglia-webfrontend. Configure the front end by appending the following to <code>/etc/apache2/apache2.conf</code>:
 Include /etc/ganglia-webfrontend/apache.conf
Configure gmetad by adding the following lines to <code>/etc/ganglia/gmetad.conf</code>:
 data_source "HCL Cluster" 192.168.20.1 192.168.20.16
 data_source "HCL Service" localhost:8648
This means that the gmetad collector connects to hcl01 and hcl16 on the .20 subnet to gather data for the frontend to use.
After all packages are configured, execute:
<source lang="text">
service apache2 restart
service gmetad restart
</source>
Pointing your browser to [http://heterogeneous.ucd.ie/ganglia/index.php here] should display the monitoring page for the HCL Cluster. <code>gmond</code> must also be installed and configured on the cluster nodes.

=Hardware Monitoring & Backup=
==Disk Monitoring==
Install smartmontools as per [http://www.howtoforge.com/checking-hard-disk-sanity-with-smartmontools-debian-ubuntu here]. Briefly:
 apt-get install smartmontools
Edit <code>/etc/default/smartmontools</code> so that it contains:
<source lang="text">
# List of devices you want to explicitly enable S.M.A.R.T. for
# Not needed (and not recommended) if the device is monitored by smartd
enable_smart="/dev/sda"

# uncomment to start smartd on system startup
start_smartd=yes

# uncomment to pass additional options to smartd on startup
smartd_opts="--interval=1800"
</source>
Open <code>/etc/smartd.conf</code> and edit the first line that begins with DEVICESCAN (all lines after the first instance of DEVICESCAN are ignored). Have it read something like:
 DEVICESCAN -d removable -n standby -m root -m robert_higgins@iol.ie -M exec /usr/share/smartmontools/smartd-runner
Then start the service with <code>/etc/init.d/smartmontools start</code>.

Note: consider installing this on all nodes, as it would be useful to have prior notice of any failing disks.

= Torque - PBS =
Download Torque from [http://www.clusterresources.com/pages/products/torque-resource-manager.php here]. Extract the archive and configure:
 ./configure --prefix=/usr/local
Run <code>make && make install</code>. Torque will install files under <code>/usr/local</code> and <code>/var/spool/torque</code>.
Edit the Torque config file <code>/var/spool/torque/torque.cfg</code>, adding the line:
 SERVERHOST=192.168.20.254
Add the compute nodes of the cluster to the file <code>/var/spool/torque/server_priv/nodes</code>:
<source lang="text">
hcl01
hcl02
hcl03
hcl04
hcl05
hcl06
hcl07
hcl08
hcl09
hcl10
hcl11
hcl12
hcl13
hcl14
hcl15
hcl16
</source>
By default /usr/local/lib may not be in the list of directories searched for dynamically linked libraries. If so, add that path to the end of /etc/ld.so.conf.d/local.conf and run <code>ldconfig</code> to update the cache.

Now return to the extracted source distribution, find the <code>contrib/init.d</code> folder and copy <code>debian.pbs_server</code> to <code>/etc/init.d/pbs_server</code>. Edit the copied file so that the header reads:
<source lang="bash">
#!/bin/sh
### BEGIN INIT INFO
# Provides:       pbs_server
# Required-Start: $local_fs $named
# Should-Start:
# Required-Stop:
</source>
This will ensure that important services are started before the PBS/Torque daemon. Then run:
 update-rc.d pbs_server defaults
 pbs_server -t create
 qterm
 pbs_server

===Packages for nodes===
 make packages
 scp torque-package-mom-linux-i686.sh hclxx:
 scp torque-package-clients-linux-i686.sh hclxx:

===Queues===
We will configure 4 queues, with varying priority and preemption policies.
*The first is the <b>normal</b> queue. This is where most jobs, which should not be interrupted, will be placed. It has the highest priority.
*The second queue is <b>lowpri</b>; it is for running jobs which may run for extended periods but are interruptible. Should a job <i>A</i> be placed on the normal queue while a lowpri job <i>B</i> is running, <i>B</i> will be sent a kill signal, so it may shut down cleanly, and it will then be requeued on the system, so it can resume running after <i>A</i> has finished.
*The third queue is for running service jobs like the homedir backup. It is named <b>service</b>. It will have lower priority than the above queues, and jobs running on it will be preemptable.
*The final queue is the <b>volunteer</b> queue. This has the lowest priority and is for executing volunteer computing jobs during otherwise idle periods. These jobs will also be preemptable.

====Queue Setup====
Enter the queue manager program <code>qmgr</code> as the root user on heterogeneous. Send it the following commands.

Allow all users to see all queued jobs:
 set server query_other_jobs = TRUE
Create the default <b>normal</b> queue:
 create queue normal
 set queue normal queue_type = Execution
 set queue normal Priority = 10000
 set queue normal enabled = True
 set queue normal started = True
Create the <b>lowpri</b> queue:
 create queue lowpri
 set queue lowpri queue_type = Execution
 set queue lowpri Priority = 1000
 set queue lowpri enabled = True
 set queue lowpri started = True
Create the <b>service</b> queue:
 create queue service
 set queue service queue_type = Execution
 set queue service Priority = 100
 set queue service enabled = True
 set queue service started = True
Create the <b>volunteer</b> queue:
 create queue volunteer
 set queue volunteer queue_type = Execution
 set queue volunteer Priority = 0
 set queue volunteer enabled = True
 set queue volunteer started = True
Set some server settings:
 set server default_queue = normal
 set server resources_default.nodes = 1
 set server resources_default.walltime = 12:00:00

= Maui Scheduler =
Download from [http://www.clusterresources.com/product/maui/ here].
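For repeatability, the qmgr commands above can be kept in a file and replayed in one pass rather than typed interactively. A sketch (the file path is arbitrary, and only the <b>normal</b> queue is shown; repeat the pattern for the other three queues):

```shell
# Write the queue definitions once, then feed them to qmgr in a
# single pass. Only the "normal" queue is included here.
cat > /tmp/queues.qmgr <<'EOF'
set server query_other_jobs = True
create queue normal
set queue normal queue_type = Execution
set queue normal Priority = 10000
set queue normal enabled = True
set queue normal started = True
set server default_queue = normal
EOF

# Then, as root on heterogeneous (not run here):
# qmgr < /tmp/queues.qmgr
```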
Unpack it, then configure with:
 ./configure --with-pbs --with-spooldir=/var/spool/maui
Now edit the Maui configuration, <code>/var/spool/maui/maui.cfg</code>, for these queues:
<source lang="text">
SERVERHOST            heterogeneous
ADMIN1                root
RMCFG[HETEROGENEOUS]  TYPE=PBS
AMCFG[bank]           TYPE=NONE
RMPOLLINTERVAL        00:00:01
QUEUETIMEWEIGHT       0
PREEMPTIONPOLICY      REQUEUE
DEFERTIME             0
DEFERCOUNT            86400
NODEALLOCATIONPOLICY  FIRSTAVAILABLE
QOSWEIGHT             1
QOSCFG[normal]    QFLAGS=PREEMPTOR
QOSCFG[lowpri]    QFLAGS=PREEMPTEE:PREEMPTOR QTWEIGHT=0
QOSCFG[service]   QFLAGS=PREEMPTEE:PREEMPTOR QTWEIGHT=0
QOSCFG[volunteer] QFLAGS=PREEMPTEE:PREEMPTOR QTWEIGHT=0
CREDWEIGHT        1
CLASSWEIGHT       1
CLASSCFG[normal]    QDEF=normal    PRIORITY=10000
CLASSCFG[lowpri]    QDEF=lowpri    PRIORITY=1000
CLASSCFG[service]   QDEF=service   PRIORITY=100
CLASSCFG[volunteer] QDEF=volunteer PRIORITY=1
</source>

* Basic installation of Debian Squeeze
See also [[HCL_cluster/hcl_node_install_configuration_log]]

==Networking==
===Interfaces===
* Edit <code>/etc/network/interfaces</code>. Note that at some point eth1 should be configured by DHCP; it is on the UCD LAN and must be registered correctly (update the MAC address with services). <code>eth0</code> is the internal network.
<source lang="text">
auto lo eth0 eth1 eth2

iface lo inet loopback

iface eth0 inet static
    address 192.168.20.254
    netmask 255.255.255.0

iface eth1 inet static
    address 192.168.21.254
    netmask 255.255.255.0

iface eth2 inet dhcp
</source>
* Install the non-free Linux firmware for the network interface (eth0). This will allow Gigabit operation on eth0 with the tg3 hardware (I think). Edit <code>/etc/apt/sources.list</code> to include the lines:
<source lang="text">deb http://ftp.ie.debian.org/debian/ squeeze main contrib non-free
deb-src http://ftp.ie.debian.org/debian/ squeeze main contrib non-free</source>
* Install firmware-linux:
<source lang="text">apt-get update && apt-get install firmware-linux</source>
You probably need to reboot now.

===DNS / BIND===
We will run our own DNS server for the cluster. Because dhclient is running on eth2, we need to stop it breaking resolv.conf.
Add to <code>/etc/dhcp/dhclient.conf</code>:
<source lang="text">
supersede domain-name "heterogeneous.ucd.ie ucd.ie";
prepend domain-name-servers 127.0.0.1;
</source>
After running dhclient, resolv.conf should read:
<source lang="text">
domain heterogeneous.ucd.ie
search heterogeneous.ucd.ie ucd.ie
nameserver 127.0.0.1
nameserver 137.43.116.19
nameserver 137.43.116.17
nameserver 137.43.105.22
</source>
Now install bind9 (<code>apt-get install bind9</code>). Edit <code>/etc/bind/named.conf.local</code> and set the domain zones for the cluster (forward and reverse). We have two subnets where reverse lookups will have to be specified: 192.168.20 and 192.168.21.
<source lang="text">
//
// Do any local configuration here
//

// Consider adding the 1918 zones here, if they are not used in your
// organization
//include "/etc/bind/zones.rfc1918";

include "/etc/bind/rndc.key";

controls {
    inet 127.0.0.1 allow { localhost; } keys { "rndc-key"; };
};

zone "heterogeneous.ucd.ie" {
    type master;
    file "db.heterogeneous.ucd.ie";
};

zone "21.168.192.in-addr.arpa" {
    type master;
    file "db.192.168.21";
};

zone "20.168.192.in-addr.arpa" {
    type master;
    file "db.192.168.20";
};
</source>
Also edit the options file, <code>/etc/bind/named.conf.options</code>. Note the subnet we define in the allow sections, 192.168.20/23; it will permit access from 192.168.20.* and 192.168.21.* addresses.
<source lang="text">
options {
    directory "/var/cache/bind";

    // If there is a firewall between you and nameservers you want
    // to talk to, you may need to fix the firewall to allow multiple
    // ports to talk.  See http://www.kb.cert.org/vuls/id/800113

    // If your ISP provided one or more IP addresses for stable
    // nameservers, you probably want to use them as forwarders.
    // Uncomment the following block, and insert the addresses replacing
    // the all-0's placeholder.
    forwarders {
        137.43.116.19;
        137.43.116.17;
        137.43.105.22;
    };

    recursion yes;
    version "REFUSED";
    allow-recursion { 127.0.0.1; 192.168.20.0/23; };
    allow-query { 127.0.0.1; 192.168.20.0/23; };

    auth-nxdomain no;    # conform to RFC1035
    listen-on-v6 { any; };
};
</source>
Now work on the zone files specified above, <code>/var/cache/bind/db.heterogeneous.ucd.ie</code> and the reverse maps <code>/var/cache/bind/db.192.168.20</code> & <code>/var/cache/bind/db.192.168.21</code>. Populate them with all nodes of the cluster.

===IP Tables===
* Set up <code>iptables</code>. We want to implement NAT between the internal network (<code>eth0</code>) and the external one (<code>eth1</code>). Add a script named <code>00iptables</code> to the <code>/etc/network/if-up.d</code> directory. All scripts in this directory are executed after network interfaces are brought up, so the rules will persist:
<source lang="bash">
#!/bin/sh
PATH=/usr/sbin:/sbin:/bin:/usr/bin

IF_INT=eth0
IF_EXT=eth1

#
# delete all existing rules.
#
iptables -F
iptables -t nat -F
iptables -t mangle -F
iptables -X

# Always accept loopback traffic
iptables -A INPUT -i lo -j ACCEPT

# Allow established connections, and those not coming from the outside
iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
iptables -A INPUT -m state --state NEW ! -i $IF_EXT -j ACCEPT
iptables -A FORWARD -i $IF_EXT -o $IF_INT -m state --state ESTABLISHED,RELATED -j ACCEPT

# Allow outgoing connections from the LAN side.
iptables -A FORWARD -i $IF_INT -o $IF_EXT -j ACCEPT

# Masquerade.
iptables -t nat -A POSTROUTING -o $IF_EXT -j MASQUERADE

# Don't forward from the outside to the inside.
iptables -A FORWARD -i $IF_EXT -o $IF_EXT -j REJECT

# Enable routing.
echo 1 > /proc/sys/net/ipv4/ip_forward
</source>
This will have lower priority than the above queues and jobs running on this queue will be preemptable. *The final queue is the <b>volunteer</b> queue. This has the lowest priority and is for executing volunteer computing jobs during otherwise idle periods. These jobs will also be preemptable. ====Queue Setup==== Enter the queue manager program <code>qmgr</code> as the root user on heterogeneous. Send it the following commands: Allow all users to see all queued jobs: set server query_other_jobs=TRUE' Create the default <b>normal</b> queue create queue normal set queue normal queue_type = Execution set queue normal Priority = 10000 set queue normal enabled = True set queue normal started = True Create <b>lowpri</b> queue create queue lowpri set queue lowpri queue_type = Execution set queue lowpri Priority = 1000 set queue lowpri enabled = True set queue lowpri started = True Create <b>service</b> queue create queue service set queue service queue_type = Execution set queue service Priority = 100 set queue service enabled = True set queue service started = True Create <b>volunteer</b> queue create queue volunteer set queue volunteer queue_type = Execution set queue volunteer Priority = 0 set queue volunteer enabled = True set queue volunteer started = True Set some server settings set server default_queue=normal set server resources_default.nodes = 1 set server resources_default.walltime = 12:00:00 = Maui Scheduler = Download from [http://www.clusterresources.com/product/maui/ here], unpack and run ./configure --with-pbs --with-spooldir=/var/spool/maui Edit /var/spool/maui/maui.cfg <source lang="text"> SERVERHOST heterogeneous ADMIN1 root RMCFG[HETEROGENEOUS] TYPE=PBS AMCFG[bank] TYPE=NONE RMPOLLINTERVAL 00:00:01 QUEUETIMEWEIGHT 0 PREEMPTIONPOLICY REQUEUE DEFERTIME 0 DEFERCOUNT 86400 NODEALLOCATIONPOLICY FIRSTAVAILABLE QOSWEIGHT 1 QOSCFG[normal] QFLAGS=PREEMPTOR QOSCFG[lowpri] QFLAGS=PREEMPTEE:PREEMPTOR QTWEIGHT=0 QOSCFG[service] QFLAGS=PREEMPTEE:PREEMPTOR QTWEIGHT=0 
QOSCFG[volunteer] QFLAGS=PREEMPTEE:PREEMPTOR QTWEIGHT=0 CREDWEIGHT 1 CLASSWEIGHT 1 CLASSCFG[normal] QDEF=normal PRIORITY=10000 CLASSCFG[lowpri] QDEF=lowpri PRIORITY=1000 CLASSCFG[service] QDEF=service PRIORITY=100 CLASSCFG[volunteer] QDEF=volunteer PRIORITY=1 </source> edit /etc/profile <source lang="text". if [ "`id -u`" -eq 0 ]; then PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/maui/bin:/usr/local/maui/sbin" else PATH="/usr/local/bin:/usr/bin:/bin:/usr/games:/usr/local/maui/bin" fi </source> 95e0578bc14965ea60f1fd8cb97be7df89e6f093 533 532 2010-10-18T16:00:45Z Davepc 2 /* Maui Scheduler */ wikitext text/x-wiki * Basic installation of Debian Squeeze See also [[HCL_cluster/hcl_node_install_configuration_log]] ==Networking== ===Interfaces=== * edit <code>/etc/networks/interfaces</code> Note that at some point eth1 should be configured by DHCP, it is on the UCD LAN and must be registered correctly (update MAC address with services). <code>eth0</code> is the internal network. <source lang="text"> auto lo eth0 eth1 eth2 iface lo inet loopback iface eth0 inet static address 192.168.20.254 netmask 255.255.255.0 iface eth1 inet static address 192.168.21.254 netmask 255.255.255.0 iface eth2 inet dhcp </source> * Install non-free linux firmware for network interface (eth0). This will allow Gigabit operation on eth0 with the tg3 hardware (I think). Edit <code>/etc/apt/sources.list</code> including the lines: <source lang="text">deb http://ftp.ie.debian.org/debian/ squeeze main contrib non-free deb-src http://ftp.ie.debian.org/debian/ squeeze main contrib non-free</source> * Install firmware-linux: <source lang="text">apt-get update && apt-get install firmware-linux</source> You probably need to reboot now. ===DNS / BIND=== We will run our own DNS server for the cluster. Because dhclient is running on eth2, we need to stop it breaking resolv.conf. 
Add to <code>/etc/dhcp/dhclient.conf</code> <source lang="text"> supersede domain-name "heterogeneous.ucd.ie ucd.ie"; prepend domain-name-servers 127.0.0.1; </source> After running dhclient, resolv.conf should read: <source lang="text"> domain heterogeneous.ucd.ie search heterogeneous.ucd.ie ucd.ie nameserver 127.0.0.1 nameserver 137.43.116.19 nameserver 137.43.116.17 nameserver 137.43.105.22 </source> Now install bind9 (<code>apt-get install bind9</code>). Edit <code>/etc/bind/named.conf.local</code> and set the domain zones for the cluster (forwards and reverse). We have two subdomains where reverse lookups will have to be specified 192.168.20 and 192.168.21 <source lang="text"> // // Do any local configuration here // // Consider adding the 1918 zones here, if they are not used in your // organization //include "/etc/bind/zones.rfc1918"; include "/etc/bind/rndc.key"; controls { inet 127.0.0.1 allow { localhost; } keys { "rndc-key"; }; }; zone "heterogeneous.ucd.ie" { type master; file "db.heterogeneous.ucd.ie"; }; zone "21.168.192.in-addr.arpa" { type master; file "db.192.168.21"; }; zone "20.168.192.in-addr.arpa" { type master; file "db.192.168.20"; }; </source> Also edit the options file: <code>/etc/bind/named.conf.options</code>, note the subnet we define in the allow sections, 192.168.20/23, it will permit access from 192.168.20.* and 192.168.21.* addresses. <source lang="text"> options { directory "/var/cache/bind"; // If there is a firewall between you and nameservers you want // to talk to, you may need to fix the firewall to allow multiple // ports to talk. See http://www.kb.cert.org/vuls/id/800113 // If your ISP provided one or more IP addresses for stable // nameservers, you probably want to use them as forwarders. // Uncomment the following block, and insert the addresses replacing // the all-0's placeholder. 
forwarders { 137.43.116.19; 137.43.116.17; 137.43.105.22; }; recursion yes; version "REFUSED"; allow-recursion { 127.0.0.1; 192.168.20.0/23; }; allow-query { 127.0.0.1; 192.168.20.0/23; }; auth-nxdomain no; # conform to RFC1035 listen-on-v6 { any; }; }; </source> Now work on the zone files specified: <code>/var/cache/bind/db.heterogeneous.ucd.ie</code> and the reverse maps <code>/var/cache/bind/db.192.168.21</code> & <code>/var/cache/bind/db.192.168.20</code>. Populate them with all nodes of the cluster. ===IP Tables=== * Set up <code>iptables</code>. We want to implement NAT between the internal network (<code>eth0</code>) and the external one (<code>eth1</code>). Add a script named <code>00iptables</code> to the <code>/etc/network/if-up.d</code> directory. All scripts in this directory are executed after network interfaces are brought up, so this configuration will persist: <source lang="bash"> #!/bin/sh PATH=/usr/sbin:/sbin:/bin:/usr/bin IF_INT=eth0 IF_EXT=eth1 # # delete all existing rules. # iptables -F iptables -t nat -F iptables -t mangle -F iptables -X # Always accept loopback traffic iptables -A INPUT -i lo -j ACCEPT # Allow established connections, and those not coming from the outside iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT iptables -A INPUT -m state --state NEW ! -i $IF_EXT -j ACCEPT iptables -A FORWARD -i $IF_EXT -o $IF_INT -m state --state ESTABLISHED,RELATED -j ACCEPT # Allow outgoing connections from the LAN side. iptables -A FORWARD -i $IF_INT -o $IF_EXT -j ACCEPT # Masquerade. iptables -t nat -A POSTROUTING -o $IF_EXT -j MASQUERADE # Don't forward new connections from the outside to the inside. iptables -A FORWARD -i $IF_EXT -o $IF_INT -j REJECT # Enable routing.
echo 1 > /proc/sys/net/ipv4/ip_forward </source> ==NFS== Install: apt-get install nfs-kernel-server nfs-common portmap Add to /etc/exports /home 192.168.20.0/255.255.254.0(rw,sync,no_root_squash,no_subtree_check) Re-export the shares with exportfs -a ==Clonezilla== Firstly, Clonezilla is probably going to pollute a lot of your server configuration when it sets itself up. Be prepared to lose your iptables configuration, NFS (if any) and DHCP settings. Maybe more. * Follow the guide to installing Clonezilla [http://www.howtoforge.com/cloning-linux-systems-with-clonezilla-server-edition-clonezilla-se-p2 here]. Essentially: ** add the repository key: <source lang="text">wget -q http://drbl.sourceforge.net/GPG-KEY-DRBL -O- | apt-key add -</source> ** add the line to /etc/apt/sources.list: <source lang="text">deb http://drbl.sourceforge.net/drbl-core drbl stable</source> ** run: <source lang="text">apt-get update && apt-get install drbl && /opt/drbl/sbin/drbl4imp</source> ** accept the default options to drbl4imp. * After Clonezilla has installed, edit <code>/etc/dhcp3/dhcpd.conf</code>, adding all entries for test nodes <code>hcl07</code> and <code>hcl03</code>. Also ensure these nodes have been removed from the existing heterogeneous.ucd.ie server configuration so that they are only served by one machine. <source lang="text"> default-lease-time 300; max-lease-time 300; option subnet-mask 255.255.255.0; option domain-name-servers 137.43.116.19,137.43.116.17,137.43.105.22; option domain-name "ucd.ie"; ddns-update-style none; # brett had ad-hoc ...?
server-name drbl; filename = "pxelinux.0"; subnet 192.168.21.0 netmask 255.255.255.0 { option subnet-mask 255.255.255.0; option routers 192.168.21.1; next-server 192.168.21.254; pool { # allow members of "DRBL-Client"; range 192.168.21.200 192.168.21.212; } host hcl03 { option host-name "hcl03.ucd.ie"; hardware ethernet 00:14:22:0A:22:6C; fixed-address 192.168.21.5; } host hcl03_eth1 { option host-name "hcl03_eth1.ucd.ie"; hardware ethernet 00:14:22:0A:22:6D; fixed-address 192.168.21.105; } host hcl07 { option host-name "hcl07.ucd.ie"; hardware ethernet 00:14:22:0A:20:E2; fixed-address 192.168.21.9; } host hcl07_eth1 { option host-name "hcl07_eth1.ucd.ie"; hardware ethernet 00:14:22:0A:20:E3; fixed-address 192.168.21.109; } default-lease-time 21600; max-lease-time 43200; } </source> ==NTP Daemon== Nodes in the cluster should use the Network Time Protocol to set their clocks, heterogeneous.ucd.ie should provide them that service. Install ntpd with the following command: apt-get install ntp Configure the daemon with the following line in <code>/etc/ntp.conf</code>: restrict 192.168.20.0 mask 255.255.254.0 nomodify notrap ==Install DHCP== Install the DHCP server package with <code>apt-get install dhcp3-server</code>. When you install Clonezilla it will probably pollute your DHCP server setup, so make ... TODO ==Install NIS== Copy users from <code>passwd</code>, <code>groups</code> and <code>shadow</code> from <code>/etc</code> on <code>hcl01</code>. Install nis. 
Set the domain to heterogeneous.ucd.ie. Edit <code>/etc/defaultdomain</code> so that it contains: heterogeneous.ucd.ie Edit <code>/etc/default/nis</code> so that it contains: # Are we a NIS server and if so what kind (values: false, slave, master) NISSERVER=master Edit <code>/etc/ypserv.securenets</code> so that it contains: # allow connections from local 255.0.0.0 127.0.0.0 # allow connections from heterogeneous subnets .20 and .21 255.255.254.0 192.168.20.0 The NIS host is also a client of itself, so do the client set-up as follows: Edit <code>/etc/hosts</code> and ensure the NIS master is listed: 192.168.20.254 heterogeneous.ucd.ie heterogeneous Edit <code>/etc/yp.conf</code> and ensure that it contains: domain heterogeneous.ucd.ie server localhost Edit <code>/etc/passwd</code>, adding a line to the end that reads: <code>+::::::</code>. Edit <code>/etc/group</code>, adding the line <code>+:::</code> at the end. The NIS Makefile will not pull user ids and group ids that are lower than a certain value; we must set this to 500 in <code>/var/yp/Makefile</code>: MINUID=500 MINGID=500 Start the <code>ypbind</code> and <code>yppasswd</code> services. Then initialise the NIS database: /usr/lib/yp/ypinit -m Accept the defaults at the prompts. Now start the other NIS services: service nis start ==Installing Ganglia Frontend== Install the packages gmetad and ganglia-webfrontend. Configure the front end by appending the following to <code>/etc/apache2/apache2.conf</code>: Include /etc/ganglia-webfrontend/apache.conf Configure gmetad by adding the following lines to <code>/etc/ganglia/gmetad.conf</code>: data_source "HCL Cluster" 192.168.20.1 192.168.20.16 data_source "HCL Service" localhost:8648 This means that the gmetad collector connects to hcl01 and hcl16 on the .20 subnet to gather data for the frontend to use.
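Several of the services above (the NTP restrict line, the NIS securenets entry and the BIND ACLs) rely on the single /23 mask 255.255.254.0 to cover both the .20 and .21 subnets. A quick, self-contained sanity check of that mask in plain POSIX shell (the sample addresses are illustrative, not taken from the cluster's DHCP leases):

```shell
# Convert a dotted quad to a 32-bit integer using shell arithmetic.
ip_to_int() {
    old_ifs=$IFS; IFS=.
    set -- $1
    IFS=$old_ifs
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

mask=$(ip_to_int 255.255.254.0)              # the /23 mask used above
net=$(( $(ip_to_int 192.168.20.0) & mask ))  # network base address

# Both cluster subnets fall inside the /23; a .22 address does not.
for ip in 192.168.20.254 192.168.21.107 192.168.22.1; do
    if [ $(( $(ip_to_int "$ip") & mask )) -eq "$net" ]; then
        echo "$ip is inside 192.168.20.0/23"
    else
        echo "$ip is outside 192.168.20.0/23"
    fi
done
```

Because the third octet of the mask is 254 (binary 11111110), octets 20 and 21 both mask down to 20, which is why one securenets/ACL entry serves both subnets.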
After all packages are configured execute: <source lang="text"> service apache2 restart service gmetad restart </source> Pointing your browser to [http://heterogeneous.ucd.ie/ganglia/index.php here] should display the monitoring page for the HCL Cluster. <code>gmond</code> must also be installed and configured on the cluster nodes. =Hardware Monitoring & Backup= ==Disk Monitoring== Install smartmontools as per [http://www.howtoforge.com/checking-hard-disk-sanity-with-smartmontools-debian-ubuntu here]. Briefly: apt-get install smartmontools Edit <code>/etc/default/smartmontools</code> so that it contains: <source lang="text"> # List of devices you want to explicitly enable S.M.A.R.T. for # Not needed (and not recommended) if the device is monitored by smartd enable_smart="/dev/sda" # uncomment to start smartd on system startup start_smartd=yes # uncomment to pass additional options to smartd on startup smartd_opts="--interval=1800" </source> Open <code>/etc/smartd.conf</code> and edit the first line that begins with DEVICESCAN (all lines after the first instance of DEVICESCAN are ignored). Have it read something like: DEVICESCAN -d removable -n standby -m root -m robert_higgins@iol.ie -M exec /usr/share/smartmontools/smartd-runner Then start the service: <code>/etc/init.d/smartmontools start</code> Note: consider installing this on all nodes, as it would be interesting to have prior notice of any failing disks. = Torque - PBS = Download Torque from [http://www.clusterresources.com/pages/products/torque-resource-manager.php here]. Extract the archive and configure: ./configure --prefix=/usr/local Run <code>make && make install</code>. Torque will have installed files under <code>/usr/local</code> and <code>/var/spool/torque</code>.
Edit the Torque config file <code>/var/spool/torque/torque.cfg</code>, adding the line: SERVERHOST=192.168.20.254 Add the compute nodes of the cluster to the file <code>/var/spool/torque/server_priv/nodes</code>: <source lang="text"> hcl01 hcl02 hcl03 hcl04 hcl05 hcl06 hcl07 hcl08 hcl09 hcl10 hcl11 hcl12 hcl13 hcl14 hcl15 hcl16 </source> By default /usr/local/lib may not be in the list of directories searched for dynamically linked libraries. If so, add that path to the end of /etc/ld.so.conf.d/local.conf and run "ldconfig" to update the cache. Now return to the extracted source distribution, find the <code>contrib/init.d</code> folder and copy <code>debian.pbs_server</code> to <code>/etc/init.d/pbs_server</code>. Edit the copied file so that the header reads: <source lang="bash"> #!/bin/sh ### BEGIN INIT INFO # Provides: pbs_server # Required-Start: $local_fs $named # Should-Start: # Required-Stop: </source> This will ensure that important services are started before the PBS/Torque daemon. Then run: update-rc.d pbs_server defaults pbs_server -t create qterm pbs_server ===Packages for nodes=== make packages scp torque-package-mom-linux-i686.sh hclxx: scp torque-package-clients-linux-i686.sh hclxx: ===Queues=== We will configure 4 queues with varying priority and preemption policies. *The first is the <b>normal</b> queue. This is where most jobs, which should not be interrupted, will be placed. It has the highest priority. *The second queue is <b>lowpri</b>; it is for running jobs which may run for extended periods but are interruptible. Should a job <i>A</i> be placed on the normal queue while a lowpri job <i>B</i> is running, <i>B</i> will be sent a kill signal so it may shut down cleanly, and then it will be requeued on the system, so it can resume running after <i>A</i> has finished. *The third queue is for running service jobs like the homedir backup. It is named <b>service</b>.
This will have a lower priority than the above queues, and jobs running on this queue will be preemptible. *The final queue is the <b>volunteer</b> queue. This has the lowest priority and is for executing volunteer computing jobs during otherwise idle periods. These jobs will also be preemptible. ====Queue Setup==== Enter the queue manager program <code>qmgr</code> as the root user on heterogeneous. Send it the following commands: Allow all users to see all queued jobs: set server query_other_jobs=TRUE Create the default <b>normal</b> queue: create queue normal set queue normal queue_type = Execution set queue normal Priority = 10000 set queue normal enabled = True set queue normal started = True Create the <b>lowpri</b> queue: create queue lowpri set queue lowpri queue_type = Execution set queue lowpri Priority = 1000 set queue lowpri enabled = True set queue lowpri started = True Create the <b>service</b> queue: create queue service set queue service queue_type = Execution set queue service Priority = 100 set queue service enabled = True set queue service started = True Create the <b>volunteer</b> queue: create queue volunteer set queue volunteer queue_type = Execution set queue volunteer Priority = 0 set queue volunteer enabled = True set queue volunteer started = True Set some server settings: set server default_queue=normal set server resources_default.nodes = 1 set server resources_default.walltime = 12:00:00 = Maui Scheduler = Download from [http://www.clusterresources.com/product/maui/ here], unpack and run: ./configure --with-pbs --with-spooldir=/var/spool/maui Edit /var/spool/maui/maui.cfg <source lang="text"> SERVERHOST heterogeneous ADMIN1 root RMCFG[HETEROGENEOUS] TYPE=PBS AMCFG[bank] TYPE=NONE RMPOLLINTERVAL 00:00:01 QUEUETIMEWEIGHT 0 PREEMPTIONPOLICY REQUEUE DEFERTIME 0 DEFERCOUNT 86400 NODEALLOCATIONPOLICY FIRSTAVAILABLE QOSWEIGHT 1 QOSCFG[normal] QFLAGS=PREEMPTOR QOSCFG[lowpri] QFLAGS=PREEMPTEE:PREEMPTOR QTWEIGHT=0 QOSCFG[service] QFLAGS=PREEMPTEE:PREEMPTOR QTWEIGHT=0
QOSCFG[volunteer] QFLAGS=PREEMPTEE:PREEMPTOR QTWEIGHT=0 CREDWEIGHT 1 CLASSWEIGHT 1 CLASSCFG[normal] QDEF=normal PRIORITY=10000 CLASSCFG[lowpri] QDEF=lowpri PRIORITY=1000 CLASSCFG[service] QDEF=service PRIORITY=100 CLASSCFG[volunteer] QDEF=volunteer PRIORITY=1 </source> Edit /etc/profile <source lang="text"> if [ "`id -u`" -eq 0 ]; then PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/maui/bin:/usr/local/maui/sbin" else PATH="/usr/local/bin:/usr/bin:/bin:/usr/games:/usr/local/maui/bin" fi </source> HCL cluster/hcl node install configuration log 0 49 498 497 2010-10-07T20:19:00Z Davepc 2 /* NFS */ wikitext text/x-wiki HCL Nodes will be installed from a clone of a root node, <code>hcl07</code>. The general installation of the root is documented here. There are a number of complications as a result of the cloning process. Solutions to these complications are also explained. =General Installation= Partition filesystem with swap at the end of the disk, size 1GB, equal to maximum of the installed memory on cluster nodes. Root file system occupies the remainder of the disk, EXT4 format. Install long list of packages. ==Networking== Configure network interface as follows: <source lang="text"># This file describes the network interfaces available on your system # and how to activate them. For more information, see interfaces(5). # The loopback network interface auto lo eth0 eth1 iface lo inet loopback # The primary network interface allow-hotplug eth0 iface eth0 inet dhcp allow-hotplug eth1 iface eth1 inet dhcp </source> ===Routing Tables=== Add an executable script named <code>00routes</code> to the directory <code>/etc/network/if-up.d</code>. This script will be called after the interfaces listed as <code>auto</code> are brought up on boot (or networking service restart). Note: the DHCP process for eth1 is backgrounded so that the startup of other services can continue. Our routing script adds routes on this interface even though it is not yet fully up. The script outputs some errors but the routing entries remain none-the-less.
The script should read as follows: <source lang="bash"> #!/bin/sh # Static Routes # route ganglia broadcast t route add -host 239.2.11.72 dev eth0 # all traffic to heterogeneous gate goes through eth0 route add -host 192.168.20.254 dev eth0 # all subnet traffic goes through specific interface route add -net 192.168.20.0 netmask 255.255.255.0 dev eth0 route add -net 192.168.21.0 netmask 255.255.255.0 dev eth1 </source> The naming of the script is important, we want our routes in place before other scripts in the <code>/etc/network/if-up.d</code> directory are executed, the order in which they are executed is alphabetical. ===Hosts=== Change the hosts file so that it does not list the node's hostname, otherwise this would confuse nodes that are cloned from this image. <source lang="text"> 127.0.0.1 localhost # The following lines are desirable for IPv6 capable hosts ::1 localhost ip6-localhost ip6-loopback fe00::0 ip6-localnet ff00::0 ip6-mcastprefix ff02::1 ip6-allnodes ff02::2 ip6-allrouters </source> ==Ganglia== Install the ganglia-monitor package. Configure ganglia monitor by editing <code>/etc/ganglia/gmond.conf</code> so that it contains: <source lang="text">cluster { name = "HCL Cluster" owner = "University College Dublin" latlong = "unspecified" url = "http://hcl.ucd.ie/" } </source> And ... <source lang="text"> /* Feel free to specify as many udp_send_channels as you like. Gmond used to only support having a single channel */ udp_send_channel { mcast_join = 239.2.11.72 port = 8649 ttl = 1 } /* You can specify as many udp_recv_channels as you like as well. */ udp_recv_channel { mcast_join = 239.2.11.72 port = 8649 bind = 239.2.11.72 } </source> After all packages are complete execute: <source lang="text"> service ganglia-monitor restart </source> ==NIS Client== Install nis package. 
Set <code>/etc/defaultdomain</code> to contain <code>heterogeneous.ucd.ie</code>. Make sure the NIS server has an entry in <code>/etc/hosts</code>; DNS may not be active when the NIS client is starting, and we want to ensure that it connects to the server successfully. 192.168.21.254 heterogeneous.ucd.ie heterogeneous Make sure the file <code>/etc/nsswitch.conf</code> contains: passwd: compat group: compat shadow: compat Append to the file <code>/etc/passwd</code> the line <code>+::::::</code> Append to the file <code>/etc/group</code> the line <code>+:::</code> Append to the file <code>/etc/shadow</code> the line <code>+::::::::</code> Start the nis service: service nis start Check that NIS is operating correctly by running the following command: ypcat passwd ==NFS== apt-get install nfs-common portmap Add the line 192.168.20.254:/home /home nfs soft,retrans=6 0 0 to <code>/etc/fstab</code> ==NTP== Install the NTP software: apt-get install ntp Edit the configuration and make <code>server heterogeneous.ucd.ie</code> the sole server entry. Comment out any other servers. Restart the NTP service. =Complications= ==Hostnames== Debian does not pull the hostname from the DHCP server. Without intervention, cloned nodes will keep the hostname stored on the image of the root node. A bug describing the setting of a hostname via DHCP is reported [https://bugs.launchpad.net/ubuntu/+source/dhcp3/+bug/90388 here]. The solution we will use is to add the file <code>/etc/dhcp3/dhclient-exit-hooks.d/hostname</code> with the contents: <source lang="bash"> if [ -n "$new_host_name" ]; then echo "$new_host_name" > /etc/hostname /bin/hostname "$new_host_name" fi </source> The effect of this is to set the hostname of the machine after an interface is configured using dhclient (the DHCP client). Note: the hostname of the machine will be set by the last interface that is configured via DHCP; in the current configuration that will be <code>eth0</code>.
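The hook's behaviour can be checked without waiting for a DHCP lease by setting <code>new_host_name</code> by hand (the variable dhclient exports to its exit hooks). A sketch that writes to a temporary file instead of <code>/etc/hostname</code>, so it is safe to run as an ordinary user:

<source lang="bash">
#!/bin/sh
# Simulate the exit hook: dhclient would export new_host_name itself;
# here we set it manually and write to a scratch file, not /etc/hostname.
new_host_name=hcl08
hostname_file=$(mktemp)
if [ -n "$new_host_name" ]; then
    echo "$new_host_name" > "$hostname_file"
fi
cat "$hostname_file"   # prints hcl08
</source>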
If an interface is reconfigured using dhclient, the hostname will be reset to the name belonging to that interface. Further, the current hostnames for the second interface (<code>eth1</code>) on the nodes are '''invalid'''. They follow the format hcl??_eth1.ucd.ie; however, the '_' character is not permitted in hostnames, and attempting to set such a hostname fails. ==udev and Network Interfaces== The udev system attempts to keep network interface names consistent regardless of changing hardware. This may be useful for laptops with wireless cards that are plugged in and out, but it causes problems when trying to install our root node image across all machines in the cluster. A description of the problem can be read [http://www.ducea.com/2008/09/01/remove-debian-udev-persistent-net-rules/ here]. The solution is to remove the udev rules for persistent network interfaces, and to disable the generator script for these rules. On the root cloning node do the following: #remove the file <code>/etc/udev/rules.d/70-persistent-net.rules</code> #add to the top of the file <code>/lib/udev/rules.d/75-persistent-net-generator.rules</code> the following lines: <source lang="text"># skip generation of persistent network interfaces ACTION=="*", GOTO="persistent_net_generator_end"</source> ==Sysstat== [http://pagesperso-orange.fr/sebastien.godard/ Sysstat] is a useful suite of tools for measuring the performance of different system components. Unfortunately it adds some unneeded cron entries for collecting a historical set of system performance data. Though these cron entries point to disabled scripts, we will disable them nonetheless. Edit the file <code>/etc/cron.d/sysstat</code> and comment out all lines.
<source lang="text"> # The first element of the path is a directory where the debian-sa1 # script is located #PATH=/usr/lib/sysstat:/usr/sbin:/usr/sbin:/usr/bin:/sbin:/bin # Activity reports every 10 minutes everyday #5-55/10 * * * * root command -v debian-sa1 > /dev/null && debian-sa1 1 1 # Additional run at 23:59 to rotate the statistics file #59 23 * * * root command -v debian-sa1 > /dev/null && debian-sa1 60 2 </source> 94b11080cea2238971c5fb7896bd0677fb05e92c 500 498 2010-10-12T16:35:12Z Davepc 2 /* General Installation */ wikitext text/x-wiki HCL Nodes will be installed from a clone of a root node, <code>hcl07</code>. The general installation of the root is documented here. There are a number of complications as a result of the cloning process. Solutions to these complications are also explained. =General Installation= Partition filesystem with swap at the end of the disk, size 1GB, equal to maximum of the installed memory on cluster nodes. Root file system occupies the remainder of the disk, EXT4 format. Install. ctags expect ==Networking== Configure network interface as follows: <source lang="text"># This file describes the network interfaces available on your system # and how to activate them. For more information, see interfaces(5). # The loopback network interface auto lo eth0 eth1 iface lo inet loopback # The primary network interface allow-hotplug eth0 iface eth0 inet dhcp allow-hotplug eth1 iface eth1 inet dhcp </source> ===Routing Tables=== Add an executable script named <code>00routes</code> to the directory <code>/etc/network/if-up.d</code>. This script will be called after the interfaces listed as <code>auto</code> are brought up on boot (or networking service restart). Note: the DHCP process for eth1 is backgrounded so that the startup of other services can continue. Our routing script adds routes on this interface even though it is not yet fully up. The script outputs some errors but the routing entries remain none-the-less. 
The script should read as follows: <source lang="bash"> #!/bin/sh # Static Routes # route ganglia broadcast t route add -host 239.2.11.72 dev eth0 # all traffic to heterogeneous gate goes through eth0 route add -host 192.168.20.254 dev eth0 # all subnet traffic goes through specific interface route add -net 192.168.20.0 netmask 255.255.255.0 dev eth0 route add -net 192.168.21.0 netmask 255.255.255.0 dev eth1 </source> The naming of the script is important, we want our routes in place before other scripts in the <code>/etc/network/if-up.d</code> directory are executed, the order in which they are executed is alphabetical. ===Hosts=== Change the hosts file so that it does not list the node's hostname, otherwise this would confuse nodes that are cloned from this image. <source lang="text"> 127.0.0.1 localhost # The following lines are desirable for IPv6 capable hosts ::1 localhost ip6-localhost ip6-loopback fe00::0 ip6-localnet ff00::0 ip6-mcastprefix ff02::1 ip6-allnodes ff02::2 ip6-allrouters </source> ==Ganglia== Install the ganglia-monitor package. Configure ganglia monitor by editing <code>/etc/ganglia/gmond.conf</code> so that it contains: <source lang="text">cluster { name = "HCL Cluster" owner = "University College Dublin" latlong = "unspecified" url = "http://hcl.ucd.ie/" } </source> And ... <source lang="text"> /* Feel free to specify as many udp_send_channels as you like. Gmond used to only support having a single channel */ udp_send_channel { mcast_join = 239.2.11.72 port = 8649 ttl = 1 } /* You can specify as many udp_recv_channels as you like as well. */ udp_recv_channel { mcast_join = 239.2.11.72 port = 8649 bind = 239.2.11.72 } </source> After all packages are complete execute: <source lang="text"> service ganglia-monitor restart </source> ==NIS Client== Install nis package. 
Set <code>/etc/defaultdomain</code> to contain <code>heterogeneous.ucd.ie</code> Make sure the NIS Server has an entry in <code>/etc/hosts</code>, this is because DNS may not be active when the NIS client is starting and we want to ensure that it connects to the server successful. 192.168.21.254 heterogeneous.ucd.ie heterogeneous Make sure the file <code>/etc/nsswitch.conf</code> contains: passwd: compat group: compat shadow: compat Append to the file: <code>/etc/passwd</code> the line <code>+::::::</code> Append to the file: <code>/etc/group</code> the line <code>+:::</code> Append to the file: <code>/etc/shadow</code> the line <code>+::::::::</code> Start the nis service: service nis start Check that nis is operating correctly by running the following command: ypcat passwd ==NFS== apt-get install nfs-common portmap Add the line 192.168.20.254:/home /home nfs soft,retrans=6 0 0 to <code>/etc/fstab</code> ==NTP== Install NTP software: apt-get install ntp Edit configuration and make <code>server heterogeneous.ucd.ie</code> the sole server entry. Comment out any other servers. Restart the NTP service. =Complications= ==Hostnames== Debian does not pull the hostname from the DHCP Server. Without intervention cloned nodes will keep the hostname stored on the image of the root node. A bug describing the setting of a hostname via DHCP is described [https://bugs.launchpad.net/ubuntu/+source/dhcp3/+bug/90388 here] The solution we will use is to add the file <code>/etc/dhcp3/dhclient-exit-hooks.d/hostname</code> with the contents: <source lang="bash"> if [[ -n $new_host_name ]]; then echo "$new_host_name" > /etc/hostname /bin/hostname $new_host_name fi </source> The effect of this is to set the hostname of the machine after an interface is configured using dhclient (DHCP Client). Note, the hostname of the machine will be set by the last interface that is configured via DHCP, in the current configuration that will be <code>eth0</code>. 
If an interface is reconfigured using dhclient, the hostname will be reset to the name belonging to that interface. Further, the current hostnames for the second interface on nodes <code>eth1</code> are '''invalid'''. They follow the format hcl??_eth1.ucd.ie, however the '_' character is not permitted in hostnames and attempting to set such a hostname fails. ==udev and Network Interfaces== The udev system attempts to keep network interface names consistent regardless of changing hardware. This may be useful for laptops with wirless cards that a plugged in and out, but it causes problems when trying to install our root node image across all machines in the cluster. A description of the problem can be read [http://www.ducea.com/2008/09/01/remove-debian-udev-persistent-net-rules/ here]. The solution is to remove the udev rules for persistent network interfaces, and disable the generator script for these rules. On the root cloning node do the following #remove the file <code>/etc/udev/rules.d/70-persistent-net.rules</code> #and to the top of the file: <code>/lib/udev/rules.d/75-persistent-net-generator.rules</code>, the following lines: <source lang="text"># skip generation of persistent network interfaces ACTION=="*", GOTO="persistent_net_generator_end"</source> ==Sysstat== [http://pagesperso-orange.fr/sebastien.godard/ Sysstat] is a useful suite of tools for measuring performance of different system components. Unfortunately it adds some un-useful cron entries for collecting a historic set of system performance data. Though these cron entries point to disabled scripts, we will disable them none the less. Edit the file <code>/etc/cron.d/sysstat</code> and comment out all lines. 
<source lang="text"> # The first element of the path is a directory where the debian-sa1 # script is located #PATH=/usr/lib/sysstat:/usr/sbin:/usr/sbin:/usr/bin:/sbin:/bin # Activity reports every 10 minutes everyday #5-55/10 * * * * root command -v debian-sa1 > /dev/null && debian-sa1 1 1 # Additional run at 23:59 to rotate the statistics file #59 23 * * * root command -v debian-sa1 > /dev/null && debian-sa1 60 2 </source> 51bc4f809ab3f68adac96cae3865f979ad58c025 501 500 2010-10-12T17:06:16Z Davepc 2 wikitext text/x-wiki HCL Nodes will be installed from a clone of a root node, <code>hcl07</code>. The general installation of the root is documented here. There are a number of complications as a result of the cloning process. Solutions to these complications are also explained. =General Installation= Partition filesystem with swap at the end of the disk, size 1GB, equal to maximum of the installed memory on cluster nodes. Root file system occupies the remainder of the disk, EXT4 format. Install long list of packages. ==Networking== Configure network interface as follows: <source lang="text"># This file describes the network interfaces available on your system # and how to activate them. For more information, see interfaces(5). # The loopback network interface auto lo eth0 eth1 iface lo inet loopback # The primary network interface allow-hotplug eth0 iface eth0 inet dhcp allow-hotplug eth1 iface eth1 inet dhcp </source> ===Routing Tables=== Add an executable script named <code>00routes</code> to the directory <code>/etc/network/if-up.d</code>. This script will be called after the interfaces listed as <code>auto</code> are brought up on boot (or networking service restart). Note: the DHCP process for eth1 is backgrounded so that the startup of other services can continue. Our routing script adds routes on this interface even though it is not yet fully up. The script outputs some errors but the routing entries remain none-the-less. 
The script should read as follows: <source lang="bash"> #!/bin/sh # Static Routes # route ganglia broadcast t route add -host 239.2.11.72 dev eth0 # all traffic to heterogeneous gate goes through eth0 route add -host 192.168.20.254 dev eth0 # all subnet traffic goes through specific interface route add -net 192.168.20.0 netmask 255.255.255.0 dev eth0 route add -net 192.168.21.0 netmask 255.255.255.0 dev eth1 </source> The naming of the script is important, we want our routes in place before other scripts in the <code>/etc/network/if-up.d</code> directory are executed, the order in which they are executed is alphabetical. ===Hosts=== Change the hosts file so that it does not list the node's hostname, otherwise this would confuse nodes that are cloned from this image. <source lang="text"> 127.0.0.1 localhost # The following lines are desirable for IPv6 capable hosts ::1 localhost ip6-localhost ip6-loopback fe00::0 ip6-localnet ff00::0 ip6-mcastprefix ff02::1 ip6-allnodes ff02::2 ip6-allrouters </source> ==Ganglia== Install the ganglia-monitor package. Configure ganglia monitor by editing <code>/etc/ganglia/gmond.conf</code> so that it contains: <source lang="text">cluster { name = "HCL Cluster" owner = "University College Dublin" latlong = "unspecified" url = "http://hcl.ucd.ie/" } </source> And ... <source lang="text"> /* Feel free to specify as many udp_send_channels as you like. Gmond used to only support having a single channel */ udp_send_channel { mcast_join = 239.2.11.72 port = 8649 ttl = 1 } /* You can specify as many udp_recv_channels as you like as well. */ udp_recv_channel { mcast_join = 239.2.11.72 port = 8649 bind = 239.2.11.72 } </source> After all packages are complete execute: <source lang="text"> service ganglia-monitor restart </source> ==NIS Client== Install nis package. 
Set <code>/etc/defaultdomain</code> to contain <code>heterogeneous.ucd.ie</code> Make sure the NIS Server has an entry in <code>/etc/hosts</code>, this is because DNS may not be active when the NIS client is starting and we want to ensure that it connects to the server successful. 192.168.21.254 heterogeneous.ucd.ie heterogeneous Make sure the file <code>/etc/nsswitch.conf</code> contains: passwd: compat group: compat shadow: compat Append to the file: <code>/etc/passwd</code> the line <code>+::::::</code> Append to the file: <code>/etc/group</code> the line <code>+:::</code> Append to the file: <code>/etc/shadow</code> the line <code>+::::::::</code> Start the nis service: service nis start Check that nis is operating correctly by running the following command: ypcat passwd ==NFS== apt-get install nfs-common portmap Add the line 192.168.20.254:/home /home nfs soft,retrans=6 0 0 to <code>/etc/fstab</code> ==NTP== Install NTP software: apt-get install ntp Edit configuration and make <code>server heterogeneous.ucd.ie</code> the sole server entry. Comment out any other servers. Restart the NTP service. =Complications= ==Hostnames== Debian does not pull the hostname from the DHCP Server. Without intervention cloned nodes will keep the hostname stored on the image of the root node. A bug describing the setting of a hostname via DHCP is described [https://bugs.launchpad.net/ubuntu/+source/dhcp3/+bug/90388 here] The solution we will use is to add the file <code>/etc/dhcp3/dhclient-exit-hooks.d/hostname</code> with the contents: <source lang="bash"> if [[ -n $new_host_name ]]; then echo "$new_host_name" > /etc/hostname /bin/hostname $new_host_name fi </source> The effect of this is to set the hostname of the machine after an interface is configured using dhclient (DHCP Client). Note, the hostname of the machine will be set by the last interface that is configured via DHCP, in the current configuration that will be <code>eth0</code>. 
If an interface is reconfigured using dhclient, the hostname will be reset to the name belonging to that interface. Further, the current hostnames for the second interface on nodes <code>eth1</code> are '''invalid'''. They follow the format hcl??_eth1.ucd.ie, however the '_' character is not permitted in hostnames and attempting to set such a hostname fails. ==udev and Network Interfaces== The udev system attempts to keep network interface names consistent regardless of changing hardware. This may be useful for laptops with wirless cards that a plugged in and out, but it causes problems when trying to install our root node image across all machines in the cluster. A description of the problem can be read [http://www.ducea.com/2008/09/01/remove-debian-udev-persistent-net-rules/ here]. The solution is to remove the udev rules for persistent network interfaces, and disable the generator script for these rules. On the root cloning node do the following #remove the file <code>/etc/udev/rules.d/70-persistent-net.rules</code> #and to the top of the file: <code>/lib/udev/rules.d/75-persistent-net-generator.rules</code>, the following lines: <source lang="text"># skip generation of persistent network interfaces ACTION=="*", GOTO="persistent_net_generator_end"</source> ==Sysstat== [http://pagesperso-orange.fr/sebastien.godard/ Sysstat] is a useful suite of tools for measuring performance of different system components. Unfortunately it adds some un-useful cron entries for collecting a historic set of system performance data. Though these cron entries point to disabled scripts, we will disable them none the less. Edit the file <code>/etc/cron.d/sysstat</code> and comment out all lines. 
<source lang="text"> # The first element of the path is a directory where the debian-sa1 # script is located #PATH=/usr/lib/sysstat:/usr/sbin:/usr/sbin:/usr/bin:/sbin:/bin # Activity reports every 10 minutes everyday #5-55/10 * * * * root command -v debian-sa1 > /dev/null && debian-sa1 1 1 # Additional run at 23:59 to rotate the statistics file #59 23 * * * root command -v debian-sa1 > /dev/null && debian-sa1 60 2 </source> 94b11080cea2238971c5fb7896bd0677fb05e92c 511 501 2010-10-15T12:50:59Z Davepc 2 /* NFS */ wikitext text/x-wiki HCL Nodes will be installed from a clone of a root node, <code>hcl07</code>. The general installation of the root is documented here. There are a number of complications as a result of the cloning process. Solutions to these complications are also explained. =General Installation= Partition filesystem with swap at the end of the disk, size 1GB, equal to maximum of the installed memory on cluster nodes. Root file system occupies the remainder of the disk, EXT4 format. Install long list of packages. ==Networking== Configure network interface as follows: <source lang="text"># This file describes the network interfaces available on your system # and how to activate them. For more information, see interfaces(5). # The loopback network interface auto lo eth0 eth1 iface lo inet loopback # The primary network interface allow-hotplug eth0 iface eth0 inet dhcp allow-hotplug eth1 iface eth1 inet dhcp </source> ===Routing Tables=== Add an executable script named <code>00routes</code> to the directory <code>/etc/network/if-up.d</code>. This script will be called after the interfaces listed as <code>auto</code> are brought up on boot (or networking service restart). Note: the DHCP process for eth1 is backgrounded so that the startup of other services can continue. Our routing script adds routes on this interface even though it is not yet fully up. The script outputs some errors but the routing entries remain none-the-less. 
The script should read as follows: <source lang="bash"> #!/bin/sh # Static Routes # route ganglia broadcast t route add -host 239.2.11.72 dev eth0 # all traffic to heterogeneous gate goes through eth0 route add -host 192.168.20.254 dev eth0 # all subnet traffic goes through specific interface route add -net 192.168.20.0 netmask 255.255.255.0 dev eth0 route add -net 192.168.21.0 netmask 255.255.255.0 dev eth1 </source> The naming of the script is important, we want our routes in place before other scripts in the <code>/etc/network/if-up.d</code> directory are executed, the order in which they are executed is alphabetical. ===Hosts=== Change the hosts file so that it does not list the node's hostname, otherwise this would confuse nodes that are cloned from this image. <source lang="text"> 127.0.0.1 localhost # The following lines are desirable for IPv6 capable hosts ::1 localhost ip6-localhost ip6-loopback fe00::0 ip6-localnet ff00::0 ip6-mcastprefix ff02::1 ip6-allnodes ff02::2 ip6-allrouters </source> ==Ganglia== Install the ganglia-monitor package. Configure ganglia monitor by editing <code>/etc/ganglia/gmond.conf</code> so that it contains: <source lang="text">cluster { name = "HCL Cluster" owner = "University College Dublin" latlong = "unspecified" url = "http://hcl.ucd.ie/" } </source> And ... <source lang="text"> /* Feel free to specify as many udp_send_channels as you like. Gmond used to only support having a single channel */ udp_send_channel { mcast_join = 239.2.11.72 port = 8649 ttl = 1 } /* You can specify as many udp_recv_channels as you like as well. */ udp_recv_channel { mcast_join = 239.2.11.72 port = 8649 bind = 239.2.11.72 } </source> After all packages are complete execute: <source lang="text"> service ganglia-monitor restart </source> ==NIS Client== Install nis package. 
Set <code>/etc/defaultdomain</code> to contain <code>heterogeneous.ucd.ie</code> Make sure the NIS Server has an entry in <code>/etc/hosts</code>, this is because DNS may not be active when the NIS client is starting and we want to ensure that it connects to the server successful. 192.168.21.254 heterogeneous.ucd.ie heterogeneous Make sure the file <code>/etc/nsswitch.conf</code> contains: passwd: compat group: compat shadow: compat Append to the file: <code>/etc/passwd</code> the line <code>+::::::</code> Append to the file: <code>/etc/group</code> the line <code>+:::</code> Append to the file: <code>/etc/shadow</code> the line <code>+::::::::</code> Start the nis service: service nis start Check that nis is operating correctly by running the following command: ypcat passwd ==NFS== apt-get install nfs-common portmap Add the line 192.168.20.254:/home /home nfs soft,retrans=6 0 0 to <code>/etc/fstab</code> Set in <code>/etc/default/nfs-common</code> NEED_IDMAPD=yes Then: service nfs-common restart mount /home ==NTP== Install NTP software: apt-get install ntp Edit configuration and make <code>server heterogeneous.ucd.ie</code> the sole server entry. Comment out any other servers. Restart the NTP service. =Complications= ==Hostnames== Debian does not pull the hostname from the DHCP Server. Without intervention cloned nodes will keep the hostname stored on the image of the root node. A bug describing the setting of a hostname via DHCP is described [https://bugs.launchpad.net/ubuntu/+source/dhcp3/+bug/90388 here] The solution we will use is to add the file <code>/etc/dhcp3/dhclient-exit-hooks.d/hostname</code> with the contents: <source lang="bash"> if [[ -n $new_host_name ]]; then echo "$new_host_name" > /etc/hostname /bin/hostname $new_host_name fi </source> The effect of this is to set the hostname of the machine after an interface is configured using dhclient (DHCP Client). 
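Since a malformed name (for example one containing '_') makes <code>hostname</code> fail, the value supplied by DHCP can be sanity-checked before it is written. A sketch of such a check, assuming RFC 1123 label rules (letters, digits and hyphens only, no leading or trailing hyphen); the <code>valid_label</code> helper and the example names are hypothetical:

<source lang="bash">
#!/bin/sh
# Hypothetical check: accept only RFC 1123 host labels.
valid_label() {
    case "$1" in
        *[!a-zA-Z0-9-]*) return 1 ;;  # illegal character, e.g. '_'
        ''|-*|*-)        return 1 ;;  # empty, or leading/trailing hyphen
        *)               return 0 ;;
    esac
}
valid_label hcl08      && echo "hcl08: ok"
valid_label hcl01_eth1 || echo "hcl01_eth1: rejected"
</source>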
Note, the hostname of the machine will be set by the last interface that is configured via DHCP, in the current configuration that will be <code>eth0</code>. If an interface is reconfigured using dhclient, the hostname will be reset to the name belonging to that interface. Further, the current hostnames for the second interface on nodes <code>eth1</code> are '''invalid'''. They follow the format hcl??_eth1.ucd.ie, however the '_' character is not permitted in hostnames and attempting to set such a hostname fails. ==udev and Network Interfaces== The udev system attempts to keep network interface names consistent regardless of changing hardware. This may be useful for laptops with wirless cards that a plugged in and out, but it causes problems when trying to install our root node image across all machines in the cluster. A description of the problem can be read [http://www.ducea.com/2008/09/01/remove-debian-udev-persistent-net-rules/ here]. The solution is to remove the udev rules for persistent network interfaces, and disable the generator script for these rules. On the root cloning node do the following #remove the file <code>/etc/udev/rules.d/70-persistent-net.rules</code> #and to the top of the file: <code>/lib/udev/rules.d/75-persistent-net-generator.rules</code>, the following lines: <source lang="text"># skip generation of persistent network interfaces ACTION=="*", GOTO="persistent_net_generator_end"</source> ==Sysstat== [http://pagesperso-orange.fr/sebastien.godard/ Sysstat] is a useful suite of tools for measuring performance of different system components. Unfortunately it adds some un-useful cron entries for collecting a historic set of system performance data. Though these cron entries point to disabled scripts, we will disable them none the less. Edit the file <code>/etc/cron.d/sysstat</code> and comment out all lines. 
<source lang="text"> # The first element of the path is a directory where the debian-sa1 # script is located #PATH=/usr/lib/sysstat:/usr/sbin:/usr/sbin:/usr/bin:/sbin:/bin # Activity reports every 10 minutes everyday #5-55/10 * * * * root command -v debian-sa1 > /dev/null && debian-sa1 1 1 # Additional run at 23:59 to rotate the statistics file #59 23 * * * root command -v debian-sa1 > /dev/null && debian-sa1 60 2 </source> a2370f6aafd41ae499b9a27b4fb5a020d74748dd 512 511 2010-10-15T12:54:22Z Davepc 2 /* NIS Client */ wikitext text/x-wiki HCL Nodes will be installed from a clone of a root node, <code>hcl07</code>. The general installation of the root is documented here. There are a number of complications as a result of the cloning process. Solutions to these complications are also explained. =General Installation= Partition filesystem with swap at the end of the disk, size 1GB, equal to maximum of the installed memory on cluster nodes. Root file system occupies the remainder of the disk, EXT4 format. Install long list of packages. ==Networking== Configure network interface as follows: <source lang="text"># This file describes the network interfaces available on your system # and how to activate them. For more information, see interfaces(5). # The loopback network interface auto lo eth0 eth1 iface lo inet loopback # The primary network interface allow-hotplug eth0 iface eth0 inet dhcp allow-hotplug eth1 iface eth1 inet dhcp </source> ===Routing Tables=== Add an executable script named <code>00routes</code> to the directory <code>/etc/network/if-up.d</code>. This script will be called after the interfaces listed as <code>auto</code> are brought up on boot (or networking service restart). Note: the DHCP process for eth1 is backgrounded so that the startup of other services can continue. Our routing script adds routes on this interface even though it is not yet fully up. The script outputs some errors but the routing entries remain none-the-less. 
The script should read as follows: <source lang="bash"> #!/bin/sh # Static Routes # route ganglia broadcast t route add -host 239.2.11.72 dev eth0 # all traffic to heterogeneous gate goes through eth0 route add -host 192.168.20.254 dev eth0 # all subnet traffic goes through specific interface route add -net 192.168.20.0 netmask 255.255.255.0 dev eth0 route add -net 192.168.21.0 netmask 255.255.255.0 dev eth1 </source> The naming of the script is important, we want our routes in place before other scripts in the <code>/etc/network/if-up.d</code> directory are executed, the order in which they are executed is alphabetical. ===Hosts=== Change the hosts file so that it does not list the node's hostname, otherwise this would confuse nodes that are cloned from this image. <source lang="text"> 127.0.0.1 localhost # The following lines are desirable for IPv6 capable hosts ::1 localhost ip6-localhost ip6-loopback fe00::0 ip6-localnet ff00::0 ip6-mcastprefix ff02::1 ip6-allnodes ff02::2 ip6-allrouters </source> ==Ganglia== Install the ganglia-monitor package. Configure ganglia monitor by editing <code>/etc/ganglia/gmond.conf</code> so that it contains: <source lang="text">cluster { name = "HCL Cluster" owner = "University College Dublin" latlong = "unspecified" url = "http://hcl.ucd.ie/" } </source> And ... <source lang="text"> /* Feel free to specify as many udp_send_channels as you like. Gmond used to only support having a single channel */ udp_send_channel { mcast_join = 239.2.11.72 port = 8649 ttl = 1 } /* You can specify as many udp_recv_channels as you like as well. */ udp_recv_channel { mcast_join = 239.2.11.72 port = 8649 bind = 239.2.11.72 } </source> After all packages are complete execute: <source lang="text"> service ganglia-monitor restart </source> ==NIS Client== Install nis package. 
Set <code>/etc/defaultdomain</code> to contain <code>heterogeneous.ucd.ie</code>. Make sure the NIS server has an entry in <code>/etc/hosts</code>; DNS may not be active when the NIS client is starting, and we want to ensure that it connects to the server successfully. 192.168.21.254 heterogeneous.ucd.ie heterogeneous Make sure the file <code>/etc/nsswitch.conf</code> contains: passwd: compat group: compat shadow: compat Append to the file <code>/etc/passwd</code> the line <code>+::::::</code> Append to the file <code>/etc/group</code> the line <code>+:::</code> Append to the file <code>/etc/shadow</code> the line <code>+::::::::</code> Edit <code>/etc/yp.conf</code> (this wasn't needed before, but it is now): ypserver 192.168.20.254 Start the nis service: service nis start Check that NIS is operating correctly by running the following command: ypcat passwd ==NFS== apt-get install nfs-common portmap Add the line 192.168.20.254:/home /home nfs soft,retrans=6 0 0 to <code>/etc/fstab</code> Set in <code>/etc/default/nfs-common</code> NEED_IDMAPD=yes Then: service nfs-common restart mount /home ==NTP== Install the NTP software: apt-get install ntp Edit the configuration and make <code>server heterogeneous.ucd.ie</code> the sole server entry. Comment out any other servers. Restart the NTP service. =Complications= ==Hostnames== Debian does not pull the hostname from the DHCP server. Without intervention, cloned nodes will keep the hostname stored on the image of the root node.
A bug describing the setting of a hostname via DHCP is described [https://bugs.launchpad.net/ubuntu/+source/dhcp3/+bug/90388 here] The solution we will use is to add the file <code>/etc/dhcp3/dhclient-exit-hooks.d/hostname</code> with the contents: <source lang="bash"> if [[ -n $new_host_name ]]; then echo "$new_host_name" > /etc/hostname /bin/hostname $new_host_name fi </source> The effect of this is to set the hostname of the machine after an interface is configured using dhclient (DHCP Client). Note, the hostname of the machine will be set by the last interface that is configured via DHCP, in the current configuration that will be <code>eth0</code>. If an interface is reconfigured using dhclient, the hostname will be reset to the name belonging to that interface. Further, the current hostnames for the second interface on nodes <code>eth1</code> are '''invalid'''. They follow the format hcl??_eth1.ucd.ie, however the '_' character is not permitted in hostnames and attempting to set such a hostname fails. ==udev and Network Interfaces== The udev system attempts to keep network interface names consistent regardless of changing hardware. This may be useful for laptops with wirless cards that a plugged in and out, but it causes problems when trying to install our root node image across all machines in the cluster. A description of the problem can be read [http://www.ducea.com/2008/09/01/remove-debian-udev-persistent-net-rules/ here]. The solution is to remove the udev rules for persistent network interfaces, and disable the generator script for these rules. 
On the root cloning node, do the following:
# Remove the file <code>/etc/udev/rules.d/70-persistent-net.rules</code>.
# Add to the top of the file <code>/lib/udev/rules.d/75-persistent-net-generator.rules</code> the following lines:

<source lang="text">
# skip generation of persistent network interfaces
ACTION=="*", GOTO="persistent_net_generator_end"
</source>

==Sysstat==

[http://pagesperso-orange.fr/sebastien.godard/ Sysstat] is a useful suite of tools for measuring the performance of different system components. Unfortunately, it adds some unhelpful cron entries for collecting a historical set of system performance data. Though these cron entries point to disabled scripts, we will disable them nonetheless. Edit the file <code>/etc/cron.d/sysstat</code> and comment out all lines.

<source lang="text">
# The first element of the path is a directory where the debian-sa1
# script is located
#PATH=/usr/lib/sysstat:/usr/sbin:/usr/sbin:/usr/bin:/sbin:/bin
# Activity reports every 10 minutes everyday
#5-55/10 * * * * root command -v debian-sa1 > /dev/null && debian-sa1 1 1
# Additional run at 23:59 to rotate the statistics file
#59 23 * * * root command -v debian-sa1 > /dev/null && debian-sa1 60 2
</source>
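Commenting out every active line of the cron file can be done in one pass; a sketch, assuming GNU sed for the in-place edit (the function name is ours):

```shell
# Sketch: prefix every non-blank line that is not already a comment with '#'.
# On a real node, run this against /etc/cron.d/sysstat.
disable_cron_file() {
    sed -i 's/^\([^#]\)/#\1/' "$1"
}
```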
2010-10-15, Davepc: [[New hcl node install & configuration log]] moved to [[HCL cluster/hcl node install configuration log]] (making it a subpage of HCL_cluster).

HCL nodes will be installed from a clone of a root node, <code>hcl07</code>. The general installation of the root node is documented here. There are a number of complications as a result of the cloning process; solutions to these complications are also explained.

=General Installation=

Partition the filesystem with swap at the end of the disk, sized 1 GB, equal to the maximum installed memory on the cluster nodes. The root file system occupies the remainder of the disk, in EXT4 format. Install the long list of packages.
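The swap-at-the-end layout can be worked out numerically; a sketch of the arithmetic (the disk size used below is illustrative, not a value from the cluster):

```shell
# Sketch: for a disk of N 512-byte sectors, a 1 GB swap partition placed at
# the end starts at sector N - (1 GiB / 512); the root partition occupies
# the sectors before it.
SWAP_SECTORS=$((1024 * 1024 * 1024 / 512))   # 1 GiB expressed in 512-byte sectors

swap_start_sector() {
    echo $(( $1 - SWAP_SECTORS ))
}
```

For example, on a disk of 1,000,000,000 sectors the swap partition would start at sector 997,902,848.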
==Networking== Configure network interface as follows: <source lang="text"># This file describes the network interfaces available on your system # and how to activate them. For more information, see interfaces(5). # The loopback network interface auto lo eth0 eth1 iface lo inet loopback # The primary network interface allow-hotplug eth0 iface eth0 inet dhcp allow-hotplug eth1 iface eth1 inet dhcp </source> ===Routing Tables=== Add an executable script named <code>00routes</code> to the directory <code>/etc/network/if-up.d</code>. This script will be called after the interfaces listed as <code>auto</code> are brought up on boot (or networking service restart). Note: the DHCP process for eth1 is backgrounded so that the startup of other services can continue. Our routing script adds routes on this interface even though it is not yet fully up. The script outputs some errors but the routing entries remain none-the-less. The script should read as follows: <source lang="bash"> #!/bin/sh # Static Routes # route ganglia broadcast t route add -host 239.2.11.72 dev eth0 # all traffic to heterogeneous gate goes through eth0 route add -host 192.168.20.254 dev eth0 # all subnet traffic goes through specific interface route add -net 192.168.20.0 netmask 255.255.255.0 dev eth0 route add -net 192.168.21.0 netmask 255.255.255.0 dev eth1 </source> The naming of the script is important, we want our routes in place before other scripts in the <code>/etc/network/if-up.d</code> directory are executed, the order in which they are executed is alphabetical. ===Hosts=== Change the hosts file so that it does not list the node's hostname, otherwise this would confuse nodes that are cloned from this image. <source lang="text"> 127.0.0.1 localhost # The following lines are desirable for IPv6 capable hosts ::1 localhost ip6-localhost ip6-loopback fe00::0 ip6-localnet ff00::0 ip6-mcastprefix ff02::1 ip6-allnodes ff02::2 ip6-allrouters </source> ==Ganglia== Install the ganglia-monitor package. 
Configure ganglia monitor by editing <code>/etc/ganglia/gmond.conf</code> so that it contains: <source lang="text">cluster { name = "HCL Cluster" owner = "University College Dublin" latlong = "unspecified" url = "http://hcl.ucd.ie/" } </source> And ... <source lang="text"> /* Feel free to specify as many udp_send_channels as you like. Gmond used to only support having a single channel */ udp_send_channel { mcast_join = 239.2.11.72 port = 8649 ttl = 1 } /* You can specify as many udp_recv_channels as you like as well. */ udp_recv_channel { mcast_join = 239.2.11.72 port = 8649 bind = 239.2.11.72 } </source> After all packages are complete execute: <source lang="text"> service ganglia-monitor restart </source> ==NIS Client== Install nis package. Set <code>/etc/defaultdomain</code> to contain <code>heterogeneous.ucd.ie</code> Make sure the NIS Server has an entry in <code>/etc/hosts</code>, this is because DNS may not be active when the NIS client is starting and we want to ensure that it connects to the server successful. 
192.168.21.254 heterogeneous.ucd.ie heterogeneous Make sure the file <code>/etc/nsswitch.conf</code> contains: passwd: compat group: compat shadow: compat Append to the file: <code>/etc/passwd</code> the line <code>+::::::</code> Append to the file: <code>/etc/group</code> the line <code>+:::</code> Append to the file: <code>/etc/shadow</code> the line <code>+::::::::</code> Edit <code>/etc/yp.conf</code> (this wasn't need before, it is now with raid server) ypserver 192.168.20.254 Start the nis service: service nis start Check that nis is operating correctly by running the following command: ypcat passwd ==NFS== apt-get install nfs-common portmap Add the line 192.168.20.254:/home /home nfs soft,retrans=6 0 0 to <code>/etc/fstab</code> Set in <code>/etc/default/nfs-common</code> (this wasn't need before, it is now with raid server) NEED_IDMAPD=yes Then: service nfs-common restart mount /home ==Torque PBS== First install PBS on headnode [[New_heterogeneous.ucd.ie_install_log#Packages_for_nodes|explained here]] ==NTP== Install NTP software: apt-get install ntp Edit configuration and make <code>server heterogeneous.ucd.ie</code> the sole server entry. Comment out any other servers. Restart the NTP service. =Complications= ==Hostnames== Debian does not pull the hostname from the DHCP Server. Without intervention cloned nodes will keep the hostname stored on the image of the root node. A bug describing the setting of a hostname via DHCP is described [https://bugs.launchpad.net/ubuntu/+source/dhcp3/+bug/90388 here] The solution we will use is to add the file <code>/etc/dhcp3/dhclient-exit-hooks.d/hostname</code> with the contents: <source lang="bash"> if [[ -n $new_host_name ]]; then echo "$new_host_name" > /etc/hostname /bin/hostname $new_host_name fi </source> The effect of this is to set the hostname of the machine after an interface is configured using dhclient (DHCP Client). 
Note, the hostname of the machine will be set by the last interface that is configured via DHCP, in the current configuration that will be <code>eth0</code>. If an interface is reconfigured using dhclient, the hostname will be reset to the name belonging to that interface. Further, the current hostnames for the second interface on nodes <code>eth1</code> are '''invalid'''. They follow the format hcl??_eth1.ucd.ie, however the '_' character is not permitted in hostnames and attempting to set such a hostname fails. ==udev and Network Interfaces== The udev system attempts to keep network interface names consistent regardless of changing hardware. This may be useful for laptops with wirless cards that a plugged in and out, but it causes problems when trying to install our root node image across all machines in the cluster. A description of the problem can be read [http://www.ducea.com/2008/09/01/remove-debian-udev-persistent-net-rules/ here]. The solution is to remove the udev rules for persistent network interfaces, and disable the generator script for these rules. On the root cloning node do the following #remove the file <code>/etc/udev/rules.d/70-persistent-net.rules</code> #and to the top of the file: <code>/lib/udev/rules.d/75-persistent-net-generator.rules</code>, the following lines: <source lang="text"># skip generation of persistent network interfaces ACTION=="*", GOTO="persistent_net_generator_end"</source> ==Sysstat== [http://pagesperso-orange.fr/sebastien.godard/ Sysstat] is a useful suite of tools for measuring performance of different system components. Unfortunately it adds some un-useful cron entries for collecting a historic set of system performance data. Though these cron entries point to disabled scripts, we will disable them none the less. Edit the file <code>/etc/cron.d/sysstat</code> and comment out all lines. 
<source lang="text"> # The first element of the path is a directory where the debian-sa1 # script is located #PATH=/usr/lib/sysstat:/usr/sbin:/usr/sbin:/usr/bin:/sbin:/bin # Activity reports every 10 minutes everyday #5-55/10 * * * * root command -v debian-sa1 > /dev/null && debian-sa1 1 1 # Additional run at 23:59 to rotate the statistics file #59 23 * * * root command -v debian-sa1 > /dev/null && debian-sa1 60 2 </source> 8e0f74d8036b6fe484333a695eff9a95f3dbb7bb 526 523 2010-10-15T14:18:03Z Davepc 2 /* Torque PBS */ wikitext text/x-wiki HCL Nodes will be installed from a clone of a root node, <code>hcl07</code>. The general installation of the root is documented here. There are a number of complications as a result of the cloning process. Solutions to these complications are also explained. =General Installation= Partition filesystem with swap at the end of the disk, size 1GB, equal to maximum of the installed memory on cluster nodes. Root file system occupies the remainder of the disk, EXT4 format. Install long list of packages. ==Networking== Configure network interface as follows: <source lang="text"># This file describes the network interfaces available on your system # and how to activate them. For more information, see interfaces(5). # The loopback network interface auto lo eth0 eth1 iface lo inet loopback # The primary network interface allow-hotplug eth0 iface eth0 inet dhcp allow-hotplug eth1 iface eth1 inet dhcp </source> ===Routing Tables=== Add an executable script named <code>00routes</code> to the directory <code>/etc/network/if-up.d</code>. This script will be called after the interfaces listed as <code>auto</code> are brought up on boot (or networking service restart). Note: the DHCP process for eth1 is backgrounded so that the startup of other services can continue. Our routing script adds routes on this interface even though it is not yet fully up. The script outputs some errors but the routing entries remain none-the-less. 
The script should read as follows: <source lang="bash"> #!/bin/sh # Static Routes # route ganglia broadcast t route add -host 239.2.11.72 dev eth0 # all traffic to heterogeneous gate goes through eth0 route add -host 192.168.20.254 dev eth0 # all subnet traffic goes through specific interface route add -net 192.168.20.0 netmask 255.255.255.0 dev eth0 route add -net 192.168.21.0 netmask 255.255.255.0 dev eth1 </source> The naming of the script is important, we want our routes in place before other scripts in the <code>/etc/network/if-up.d</code> directory are executed, the order in which they are executed is alphabetical. ===Hosts=== Change the hosts file so that it does not list the node's hostname, otherwise this would confuse nodes that are cloned from this image. <source lang="text"> 127.0.0.1 localhost # The following lines are desirable for IPv6 capable hosts ::1 localhost ip6-localhost ip6-loopback fe00::0 ip6-localnet ff00::0 ip6-mcastprefix ff02::1 ip6-allnodes ff02::2 ip6-allrouters </source> ==Ganglia== Install the ganglia-monitor package. Configure ganglia monitor by editing <code>/etc/ganglia/gmond.conf</code> so that it contains: <source lang="text">cluster { name = "HCL Cluster" owner = "University College Dublin" latlong = "unspecified" url = "http://hcl.ucd.ie/" } </source> And ... <source lang="text"> /* Feel free to specify as many udp_send_channels as you like. Gmond used to only support having a single channel */ udp_send_channel { mcast_join = 239.2.11.72 port = 8649 ttl = 1 } /* You can specify as many udp_recv_channels as you like as well. */ udp_recv_channel { mcast_join = 239.2.11.72 port = 8649 bind = 239.2.11.72 } </source> After all packages are complete execute: <source lang="text"> service ganglia-monitor restart </source> ==NIS Client== Install nis package. 
Set <code>/etc/defaultdomain</code> to contain <code>heterogeneous.ucd.ie</code> Make sure the NIS Server has an entry in <code>/etc/hosts</code>, this is because DNS may not be active when the NIS client is starting and we want to ensure that it connects to the server successful. 192.168.21.254 heterogeneous.ucd.ie heterogeneous Make sure the file <code>/etc/nsswitch.conf</code> contains: passwd: compat group: compat shadow: compat Append to the file: <code>/etc/passwd</code> the line <code>+::::::</code> Append to the file: <code>/etc/group</code> the line <code>+:::</code> Append to the file: <code>/etc/shadow</code> the line <code>+::::::::</code> Edit <code>/etc/yp.conf</code> (this wasn't need before, it is now with raid server) ypserver 192.168.20.254 Start the nis service: service nis start Check that nis is operating correctly by running the following command: ypcat passwd ==NFS== apt-get install nfs-common portmap Add the line 192.168.20.254:/home /home nfs soft,retrans=6 0 0 to <code>/etc/fstab</code> Set in <code>/etc/default/nfs-common</code> (this wasn't need before, it is now with raid server) NEED_IDMAPD=yes Then: service nfs-common restart mount /home ==Torque PBS== First install PBS on headnode [[New_heterogeneous.ucd.ie_install_log#Packages_for_nodes|explained here]]. Then: ./torque-package-mom-linux-i686.sh --install ./torque-package-clients-linux-i686.sh --install update-rc.d pbs_mom defaults service pbs_mom start ==NTP== Install NTP software: apt-get install ntp Edit configuration and make <code>server heterogeneous.ucd.ie</code> the sole server entry. Comment out any other servers. Restart the NTP service. =Complications= ==Hostnames== Debian does not pull the hostname from the DHCP Server. Without intervention cloned nodes will keep the hostname stored on the image of the root node. 
A bug describing the setting of a hostname via DHCP is described [https://bugs.launchpad.net/ubuntu/+source/dhcp3/+bug/90388 here] The solution we will use is to add the file <code>/etc/dhcp3/dhclient-exit-hooks.d/hostname</code> with the contents: <source lang="bash"> if [[ -n $new_host_name ]]; then echo "$new_host_name" > /etc/hostname /bin/hostname $new_host_name fi </source> The effect of this is to set the hostname of the machine after an interface is configured using dhclient (DHCP Client). Note, the hostname of the machine will be set by the last interface that is configured via DHCP, in the current configuration that will be <code>eth0</code>. If an interface is reconfigured using dhclient, the hostname will be reset to the name belonging to that interface. Further, the current hostnames for the second interface on nodes <code>eth1</code> are '''invalid'''. They follow the format hcl??_eth1.ucd.ie, however the '_' character is not permitted in hostnames and attempting to set such a hostname fails. ==udev and Network Interfaces== The udev system attempts to keep network interface names consistent regardless of changing hardware. This may be useful for laptops with wirless cards that a plugged in and out, but it causes problems when trying to install our root node image across all machines in the cluster. A description of the problem can be read [http://www.ducea.com/2008/09/01/remove-debian-udev-persistent-net-rules/ here]. The solution is to remove the udev rules for persistent network interfaces, and disable the generator script for these rules. 
On the root cloning node do the following #remove the file <code>/etc/udev/rules.d/70-persistent-net.rules</code> #and to the top of the file: <code>/lib/udev/rules.d/75-persistent-net-generator.rules</code>, the following lines: <source lang="text"># skip generation of persistent network interfaces ACTION=="*", GOTO="persistent_net_generator_end"</source> ==Sysstat== [http://pagesperso-orange.fr/sebastien.godard/ Sysstat] is a useful suite of tools for measuring performance of different system components. Unfortunately it adds some un-useful cron entries for collecting a historic set of system performance data. Though these cron entries point to disabled scripts, we will disable them none the less. Edit the file <code>/etc/cron.d/sysstat</code> and comment out all lines. <source lang="text"> # The first element of the path is a directory where the debian-sa1 # script is located #PATH=/usr/lib/sysstat:/usr/sbin:/usr/sbin:/usr/bin:/sbin:/bin # Activity reports every 10 minutes everyday #5-55/10 * * * * root command -v debian-sa1 > /dev/null && debian-sa1 1 1 # Additional run at 23:59 to rotate the statistics file #59 23 * * * root command -v debian-sa1 > /dev/null && debian-sa1 60 2 </source> c6e6ebccd0a56c8c85808ed7634461c26056e772 527 526 2010-10-15T14:21:13Z Davepc 2 wikitext text/x-wiki HCL Nodes will be installed from a clone of a root node, <code>hcl07</code>. The general installation of the root is documented here. There are a number of complications as a result of the cloning process. Solutions to these complications are also explained. See also [[HCL_cluster/heterogeneous.ucd.ie_install_log]] =General Installation= Partition filesystem with swap at the end of the disk, size 1GB, equal to maximum of the installed memory on cluster nodes. Root file system occupies the remainder of the disk, EXT4 format. Install long list of packages. 
==Networking==
Configure the network interfaces as follows:
<source lang="text">
# This file describes the network interfaces available on your system
# and how to activate them. For more information, see interfaces(5).

# The loopback network interface
auto lo eth0 eth1
iface lo inet loopback

# The primary network interface
allow-hotplug eth0
iface eth0 inet dhcp

allow-hotplug eth1
iface eth1 inet dhcp
</source>

===Routing Tables===
Add an executable script named <code>00routes</code> to the directory <code>/etc/network/if-up.d</code>. This script is called after the interfaces listed as <code>auto</code> are brought up on boot (or on a networking service restart).

Note: the DHCP process for eth1 is backgrounded so that the startup of other services can continue. Our routing script adds routes on this interface even though it is not yet fully up. The script prints some errors, but the routing entries remain nonetheless. The script should read as follows:
<source lang="bash">
#!/bin/sh
# Static Routes

# route for the ganglia broadcast address
route add -host 239.2.11.72 dev eth0

# all traffic to the heterogeneous gateway goes through eth0
route add -host 192.168.20.254 dev eth0

# all subnet traffic goes through a specific interface
route add -net 192.168.20.0 netmask 255.255.255.0 dev eth0
route add -net 192.168.21.0 netmask 255.255.255.0 dev eth1
</source>

The naming of the script is important: we want our routes in place before the other scripts in <code>/etc/network/if-up.d</code> are executed, and scripts in that directory run in alphabetical order.

===Hosts===
Change the hosts file so that it does not list the node's hostname; otherwise this would confuse nodes that are cloned from this image.
<source lang="text">
127.0.0.1 localhost

# The following lines are desirable for IPv6 capable hosts
::1     localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
</source>

==Ganglia==
Install the ganglia-monitor package.
Configure the ganglia monitor by editing <code>/etc/ganglia/gmond.conf</code> so that it contains:
<source lang="text">
cluster {
  name = "HCL Cluster"
  owner = "University College Dublin"
  latlong = "unspecified"
  url = "http://hcl.ucd.ie/"
}
</source>
And ...
<source lang="text">
/* Feel free to specify as many udp_send_channels as you like.
   Gmond used to only support having a single channel */
udp_send_channel {
  mcast_join = 239.2.11.72
  port = 8649
  ttl = 1
}

/* You can specify as many udp_recv_channels as you like as well. */
udp_recv_channel {
  mcast_join = 239.2.11.72
  port = 8649
  bind = 239.2.11.72
}
</source>
After all packages are installed, execute:
<source lang="text">
service ganglia-monitor restart
</source>

==NIS Client==
Install the nis package.

Set <code>/etc/defaultdomain</code> to contain <code>heterogeneous.ucd.ie</code>

Make sure the NIS server has an entry in <code>/etc/hosts</code>; DNS may not be active when the NIS client starts, and we want to ensure that it connects to the server successfully.
 192.168.21.254 heterogeneous.ucd.ie heterogeneous

Make sure the file <code>/etc/nsswitch.conf</code> contains:
 passwd: compat
 group: compat
 shadow: compat

Append to the file <code>/etc/passwd</code> the line <code>+::::::</code>
Append to the file <code>/etc/group</code> the line <code>+:::</code>
Append to the file <code>/etc/shadow</code> the line <code>+::::::::</code>

Edit <code>/etc/yp.conf</code> (this wasn't needed before; it is now, with the RAID server):
 ypserver 192.168.20.254

Start the nis service:
 service nis start

Check that NIS is operating correctly by running the following command:
 ypcat passwd

==NFS==
 apt-get install nfs-common portmap

Add the line
 192.168.20.254:/home /home nfs soft,retrans=6 0 0
to <code>/etc/fstab</code>

Set in <code>/etc/default/nfs-common</code> (this wasn't needed before; it is now, with the RAID server):
 NEED_IDMAPD=yes

Then:
 service nfs-common restart
 mount /home

==Torque PBS==
First install PBS on the headnode, [[New_heterogeneous.ucd.ie_install_log#Packages_for_nodes|explained here]]. Then:
 ./torque-package-mom-linux-i686.sh --install
 ./torque-package-clients-linux-i686.sh --install
 update-rc.d pbs_mom defaults
 service pbs_mom start

==NTP==
Install the NTP software:
 apt-get install ntp
Edit the configuration and make <code>server heterogeneous.ucd.ie</code> the sole server entry. Comment out any other servers. Restart the NTP service.

=Complications=

==Hostnames==
Debian does not pull the hostname from the DHCP server. Without intervention, cloned nodes will keep the hostname stored on the image of the root node.
A bug describing the setting of a hostname via DHCP is described [https://bugs.launchpad.net/ubuntu/+source/dhcp3/+bug/90388 here]. The solution we will use is to add the file <code>/etc/dhcp3/dhclient-exit-hooks.d/hostname</code> with the contents:
<source lang="bash">
if [ -n "$new_host_name" ]; then
    echo "$new_host_name" > /etc/hostname
    /bin/hostname "$new_host_name"
fi
</source>
The effect of this is to set the hostname of the machine after an interface is configured by dhclient (the DHCP client).

Note that the hostname of the machine will be set by the last interface configured via DHCP; in the current configuration that will be <code>eth0</code>. If an interface is reconfigured using dhclient, the hostname will be reset to the name belonging to that interface. Further, the current hostnames for the second interface (<code>eth1</code>) on the nodes are '''invalid'''. They follow the format hcl??_eth1.ucd.ie; however, the '_' character is not permitted in hostnames, and attempting to set such a hostname fails.

==udev and Network Interfaces==
The udev system attempts to keep network interface names consistent regardless of changing hardware. This may be useful for laptops with wireless cards that are plugged in and out, but it causes problems when trying to install our root node image across all machines in the cluster. A description of the problem can be read [http://www.ducea.com/2008/09/01/remove-debian-udev-persistent-net-rules/ here]. The solution is to remove the udev rules for persistent network interfaces, and to disable the generator script for these rules.
On the root cloning node, do the following:
#remove the file <code>/etc/udev/rules.d/70-persistent-net.rules</code>
#add to the top of the file <code>/lib/udev/rules.d/75-persistent-net-generator.rules</code> the following lines:
<source lang="text">
# skip generation of persistent network interfaces
ACTION=="*", GOTO="persistent_net_generator_end"
</source>

==Sysstat==
[http://pagesperso-orange.fr/sebastien.godard/ Sysstat] is a useful suite of tools for measuring the performance of different system components. Unfortunately, it adds some unhelpful cron entries for collecting a historic set of system performance data. Though these cron entries point to disabled scripts, we will disable them nonetheless. Edit the file <code>/etc/cron.d/sysstat</code> and comment out all lines:
<source lang="text">
# The first element of the path is a directory where the debian-sa1
# script is located
#PATH=/usr/lib/sysstat:/usr/sbin:/usr/sbin:/usr/bin:/sbin:/bin

# Activity reports every 10 minutes everyday
#5-55/10 * * * * root command -v debian-sa1 > /dev/null && debian-sa1 1 1

# Additional run at 23:59 to rotate the statistics file
#59 23 * * * root command -v debian-sa1 > /dev/null && debian-sa1 60 2
</source>

New heterogeneous.ucd.ie install log 520 2010-10-15T14:09:56Z Davepc 2 [[New heterogeneous.ucd.ie install log]] moved to [[HCL cluster/heterogeneous.ucd.ie install log]]: Making it a subpage of hcl_cluster #REDIRECT [[HCL cluster/heterogeneous.ucd.ie install log]]

New hcl node install & configuration log 524 2010-10-15T14:13:04Z Davepc 2 [[New hcl node install & configuration log]] moved to [[HCL cluster/hcl node install configuration log]]: Making it a subpage of HCL_cluster #REDIRECT [[HCL cluster/hcl node install configuration log]]

HCL cluster 535 534 2010-10-18T18:01:52Z Davepc 2 /* Creating new
user accounts */ wikitext text/x-wiki

== General Information ==
[[Image:Cluster.jpg|right|thumbnail||HCL Cluster]]
[[Image:network.jpg|right|thumbnail||Layout of the Cluster]]

The HCL cluster is heterogeneous in both computing hardware and network capability. Nodes are from Dell, IBM, and HP, with Celeron, Pentium 4, Xeon, and AMD processors ranging in speed from 1.8 to 3.6GHz. Accordingly, architectures and parameters such as front side bus, cache, and main memory all vary. The operating system used is Debian "squeeze" with Linux kernel 2.6.32.

The network hardware consists of two Cisco 24+4 port Gigabit switches. Each node has two Gigabit ethernet ports: each eth0 is connected to the first switch, and each eth1 is connected to the second switch. The switches are also connected to each other. The bandwidth of each port can be configured to any value between 8Kb/s and 1Gb/s, which allows testing on a very large number of network topologies. As the bandwidth of the link connecting the two switches can also be configured, the cluster can act as two separate clusters connected via one link. The diagram shows a schematic of the cluster.
=== Detailed Cluster Specification ===
* [[HCL Cluster Specifications]]
* [[Old HCL Cluster Specifications]] (pre May 2010)

=== Documentation ===
* [[media:PE750.tgz|Dell Poweredge 750 Documentation]]
* [[media:SC1425.tgz|Dell Poweredge SC1425 Documentation]]
* [[media:X306.pdf|IBM x-Series 306 Documentation]]
* [[media:E326.pdf|IBM e-Series 326 Documentation]]
* [[media:Proliant100SeriesGuide.pdf|HP Proliant DL-140 G2 Documentation]]
* [[media:ProliantDL320G3Guide.pdf|HP Proliant DL-320 G3 Documentation]]
* [[media:Cisco3560Specs.pdf|Cisco Catalyst 3560 Specifications]]
* [[media:Cisco3560Guide.pdf|Cisco Catalyst 3560 User Guide]]
* [[HCL Cluster Network]]

== Cluster Administration ==

===Useful Tools===
<code>root</code> on <code>heterogeneous.ucd.ie</code> has a number of [http://expect.nist.gov/ Expect] scripts (in <code>/root/scripts</code>) to automate administration of the cluster. <code>root_ssh</code> automatically logs into a host, provides the root password, and either returns a shell to the user or executes a command passed as a second argument. Command syntax is as follows:
<source lang="text">
# root_ssh
usage: root_ssh [user@]<host> [command]
</source>
Example usage, to log in and execute a command on each node in the cluster (note that the file <code>/etc/dsh/machines.list</code> contains the hostnames of all compute nodes of the cluster):
 # for i in `cat /etc/dsh/machines.list`; do root_ssh $i ps ax \| grep pbs; done
The above is sequential. To run jobs in parallel, for example <code>apt-get update && apt-get -y upgrade</code>, try the following trick with [http://www.gnu.org/software/screen/ screen]:
 # for i in `cat /etc/dsh/machines.list`; do screen -L -d -m root_ssh $i 'apt-get update && apt-get -y upgrade'; done
You can check the screenlog.* files for errors and delete them when you are happy. Sometimes all logs are sent to screenlog.0; it is not clear why.
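When per-node output matters, a variation on the screen trick is to background the jobs yourself and give every host its own log file, which avoids the screenlog.0 collision noted above. A minimal sketch, assuming the <code>root_ssh</code> helper and <code>/etc/dsh/machines.list</code> described in this section (the <code>upgrade-*.log</code> naming is illustrative):

```shell
#!/bin/sh
# Run one command on every compute node in parallel, writing each node's
# output to its own log file (upgrade-<host>.log is an illustrative name).
MACHINES=${MACHINES:-/etc/dsh/machines.list}
CMD=${CMD:-"apt-get update && apt-get -y upgrade"}
while read -r host; do
    # each job runs in the background; output is never interleaved
    root_ssh "$host" "$CMD" > "upgrade-$host.log" 2>&1 &
done < "$MACHINES"
wait   # block until every background job has finished
echo "done; check upgrade-*.log for errors"
```

As with the screen approach, the logs can be inspected and deleted once you are happy.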
== Software packages available on HCL Cluster 2.0 ==
With a fresh installation of the operating system on the HCL Cluster, the following list of packages is available:
* autoconf
* automake
* gcc
* ctags
* cg-vg
* fftw2
* git
* gfortran
* gnuplot
* libtool
* netperf
* octave3.2
* qhull
* subversion
* valgrind
* gsl-dev
* vim
* python
* mc
* openmpi-bin
* openmpi-dev
* evince
* libboost-graph-dev
* libboost-serialization-dev
* r-cran-strucchange
* graphviz
* doxygen
* colorgcc

[[HCL_cluster/hcl_node_install_configuration_log|new hcl node install & configuration log]]
[[HCL_cluster/heterogeneous.ucd.ie_install_log|new heterogeneous.ucd.ie install log]]

===APT===
To do unattended updates on cluster machines, you need to specify some environment variables and switches to apt-get:
 export DEBIAN_FRONTEND=noninteractive
 apt-get -q -y upgrade
NOTE: on hcl01 and hcl02, any update to grub will force a prompt, despite the switches above. This happens because there are two disks on these machines and grub asks which one it should install itself on.

== Access and Security ==
All access and security for the cluster is handled by the gateway machine (heterogeneous.ucd.ie). This machine is not considered a compute node and should not be used as such. The only new incoming connections allowed are ssh; other incoming packets, such as http, that are responding to requests from inside the cluster (established or related) are also allowed. Incoming ssh packets are only accepted if they originate from designated IP addresses. These must be registered UCD IPs: csserver.ucd.ie is allowed, as is hclgate.ucd.ie, on which all users have accounts. Thus, to gain access to the cluster, you can ssh from csserver, hclgate or another allowed machine to heterogeneous. From there you can ssh to any of the nodes (hcl01-hcl16) on which you are running a PBS job.
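For convenience, the two-hop login can be collapsed into one command by proxying through the gateway in the ssh client configuration. A sketch of a client-side <code>~/.ssh/config</code> fragment (an illustration of a standard OpenSSH feature, not part of the managed cluster configuration; <code>-W</code> requires a reasonably recent OpenSSH client, and usernames and the node list should be adjusted as needed):

```text
# Reach compute nodes in one command by tunnelling through the gateway.
Host hcl01 hcl02 hcl03 hcl04
    # -W forwards stdio to the target host:port over the gateway connection
    ProxyCommand ssh -W %h:%p heterogeneous.ucd.ie
```

With this in place, <code>ssh hcl01</code> from an allowed machine first authenticates to heterogeneous and then to the node.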
Access from outside the UCD network is only possible once you have gained entry to a server that allows outside connections (such as csserver.ucd.ie).

=== Creating new user accounts ===
As root on heterogeneous run:
 adduser <username>
 make -C /var/yp

=== Access to the nodes is controlled by Torque PBS ===
Use qsub to submit a job; -I requests an interactive session, and walltime is the time required.
 qsub -I -l walltime=1:00                          # Reserve 1 node for 1 hour
 qsub -l nodes=hcl01+hcl07,walltime=1:00 myscript.sh

Example script:
 #!/bin/sh
 #General Script
 #
 #These commands set up the Grid Environment for your job:
 #PBS -N JOBNAME
 #PBS -l walltime=48:00:00
 #PBS -l nodes=16
 #PBS -m abe
 #PBS -k eo
 #PBS -V
 echo foo

To see the queue:
 qstat -n
 showq
To remove your job:
 qdel JOBNUM
More info: [http://www.clusterresources.com/products/torque/docs/]

== Some networking issues on HCL cluster (unsolved) ==
"/sbin/route" should give:
 Kernel IP routing table
 Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
 239.2.11.72     *               255.255.255.255 UH    0      0        0 eth0
 heterogeneous.u *               255.255.255.255 UH    0      0        0 eth0
 192.168.21.0    *               255.255.255.0   U     0      0        0 eth1
 192.168.20.0    *               255.255.255.0   U     0      0        0 eth0
 192.168.20.0    *               255.255.254.0   U     0      0        0 eth0
 192.168.20.0    *               255.255.254.0   U     0      0        0 eth1
 default         heterogeneous.u 0.0.0.0         UG    0      0        0 eth0
For reasons unclear, sometimes many machines miss the entry:
 192.168.21.0    *               255.255.255.0   U     0      0        0 eth1
For Open MPI, this leads to an inability to make a system sockets "connect" call to any 192.*.21.* address (hangup). In this case, you can:
* switch off eth1 (see also [http://hcl.ucd.ie/wiki/index.php/OpenMPI]):
 mpirun --mca btl_tcp_if_exclude lo,eth1 ...
or
* restore the above table on all nodes by running "sh /etc/network/if-up.d/00routes" as root.
It is not yet clear why, without this entry, connections to the "21" addresses cannot be established.
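The second workaround lends itself to a small check-and-restore sketch (an illustration only, assuming the <code>00routes</code> script described earlier on this page; run as root on a suspect node):

```shell
# Sketch: if the eth1 subnet route is missing from the kernel routing table,
# re-run the 00routes script to restore the static routes.
if ! route -n | grep -q '^192\.168\.21\.0'; then
    echo "eth1 subnet route missing; restoring"
    sh /etc/network/if-up.d/00routes
fi
```

Such a check could be run from the parallel-administration loop above the queueing section to sweep all nodes at once.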
We expect that in this case the following rule should be matched (because of the mask):
 192.168.20.0    *               255.255.254.0   U     0      0        0 eth0
The packets then leave over the eth0 network interface and should travel through switch1 to switch2 and on to the eth1 interface of the corresponding node.

* If one attempts a ping from one node A, via its eth0 interface, to the address of another node B's eth1 interface, the following is observed:
** outgoing ping packets appear only on the eth0 interface of node A.
** incoming ping packets appear only on the eth1 interface of node B.
** outgoing ping response packets appear on the eth0 interface of node B, never on the eth1 interface, despite the eth1 address being pinged specifically.

What explains this? With the routing tables as they are above, or in the damaged case, the ping may arrive at the correct interface, but the response from B is routed to A-eth0 via B-eth0. Further, after a number of ping packets have been sent in sequence (50 to 100), pings from A, though the -i eth0 switch is specified, begin to appear on both A-eth0 and A-eth1. This behaviour is unexpected, but it does not affect the return path of the ping response packet. To get symmetric behaviour, where a packet leaves A-eth0, travels via the switch bridge to B-eth1, and returns from B-eth1 to A-eth0, one must ensure that the routing table of B contains no eth0 entries.

== Paging and the OOM-Killer ==
Due to the nature of the experiments run on the cluster, we often induce heavy paging and complete exhaustion of available memory on certain nodes. Linux has a pair of strategies for dealing with heavy memory use. The first is overcommitting: a process is allowed to allocate or fork even when there is no more memory available. You can see some interesting numbers here: [http://opsmonkey.blogspot.com/2007/01/linux-memory-overcommit.html].
The assumption is that processes may not use all the memory they allocate, and that failing at allocation time is worse than failing later, when the memory is actually used. More processes can be supported by allowing them to allocate memory (provided they do not use it all). The second part of the strategy is the Out-of-Memory killer (OOM killer). When memory has been exhausted and a process tries to use some 'overcommitted' part of memory, the OOM killer is invoked. Its job is to rank all processes in terms of their memory use, priority, privilege and some other parameters, and then select a process to kill based on the ranks. The argument for using overcommit + OOM killer is that rather than failing to allocate memory for some random unlucky process, which would probably terminate as a result, the kernel can instead allow the unlucky process to continue executing and later make a somewhat informed decision on which process to kill. Unfortunately, the behaviour of the OOM killer sometimes grinds the machine to a complete halt, particularly when it decides to kill system processes. There is a good discussion of the OOM killer here: [http://lwn.net/Articles/104179/] For this reason overcommit has been disabled on the cluster.
 cat /proc/sys/vm/overcommit_memory
 2
 cat /proc/sys/vm/overcommit_ratio
 100
To restore the default overcommit behaviour:
 # echo 0 > /proc/sys/vm/overcommit_memory
 # echo 50 > /proc/sys/vm/overcommit_ratio

== Manually Limit the Memory on the OS level ==
As root, edit /etc/default/grub:
 GRUB_CMDLINE_LINUX_DEFAULT="quiet mem=128M"
then run the commands:
 update-grub
 reboot
e23bff791388fcbd042dbb2dca0ac8fc3246d902 544 535 2010-11-15T19:07:18Z Rhiggins 4 /* Useful Tools */ wikitext text/x-wiki
== General Information ==
[[Image:Cluster.jpg|right|thumbnail||HCL Cluster]]
[[Image:network.jpg|right|thumbnail||Layout of the Cluster]]
The HCL cluster is heterogeneous in computing hardware and network ability.
Nodes are from Dell, IBM, and HP, with Celeron, Pentium 4, Xeon, and AMD processors ranging in speed from 1.8 to 3.6GHz. Accordingly, architectures and parameters such as front side bus, cache, and main memory all vary. The operating system is Debian “squeeze” with Linux kernel 2.6.32. The network hardware consists of two Cisco 24+4 port Gigabit switches. Each node has two Gigabit ethernet ports: each eth0 is connected to the first switch, and each eth1 is connected to the second switch. The switches are also connected to each other. The bandwidth of each port can be configured to any value between 8Kb/s and 1Gb/s, which allows testing on a very large number of network topologies. As the bandwidth on the link connecting the two switches can also be configured, the cluster can actually act as two separate clusters connected via one link. The diagram shows a schematic of the cluster.

=== Detailed Cluster Specification ===
* [[HCL Cluster Specifications]]
* [[Old HCL Cluster Specifications]] (pre May 2010)

=== Documentation ===
* [[media:PE750.tgz|Dell Poweredge 750 Documentation]]
* [[media:SC1425.tgz|Dell Poweredge SC1425 Documentation]]
* [[media:X306.pdf|IBM x-Series 306 Documentation]]
* [[media:E326.pdf|IBM e-Series 326 Documentation]]
* [[media:Proliant100SeriesGuide.pdf|HP Proliant DL-140 G2 Documentation]]
* [[media:ProliantDL320G3Guide.pdf|HP Proliant DL-320 G3 Documentation]]
* [[media:Cisco3560Specs.pdf|Cisco Catalyst 3560 Specifications]]
* [[media:Cisco3560Guide.pdf|Cisco Catalyst 3560 User Guide]]
* [[HCL Cluster Network]]

== Cluster Administration ==
===Useful Tools===
<code>root</code> on <code>heterogeneous.ucd.ie</code> has a number of [http://expect.nist.gov/ Expect] scripts to automate administration on the cluster (in <code>/root/scripts</code>). <code>root_ssh</code> will automatically log into a host, provide the root password and either return a shell to the user or execute a command that is passed as a second argument.
Command syntax is as follows:
<source lang="text">
# root_ssh
usage: root_ssh [user@]<host> [command]
</source>
Example usage, to log in and execute a command on each node in the cluster (note that the file <code>/etc/dsh/machines.list</code> contains the hostnames of all compute nodes of the cluster):
 # for i in `seq -w 1 16`; do root_ssh hcl$i ps ax \| grep pbs; done
The above is sequential. To run jobs in parallel, for example <code>apt-get update && apt-get -y upgrade</code>, try the following trick with [http://www.gnu.org/software/screen/ screen]:
 # for i in `seq -w 1 16`; do screen -L -d -m root_ssh hcl$i apt-get update \&\& apt-get -y upgrade; done
You can check the screenlog.* files for errors and delete them when you are happy. Sometimes all logs are sent to screenlog.0; it is not clear why.

== Software packages available on HCL Cluster 2.0 ==
With a fresh installation of the operating system on the HCL Cluster, the following packages are available:
* autoconf
* automake
* gcc
* ctags
* cg-vg
* fftw2
* git
* gfortran
* gnuplot
* libtool
* netperf
* octave3.2
* qhull
* subversion
* valgrind
* gsl-dev
* vim
* python
* mc
* openmpi-bin
* openmpi-dev
* evince
* libboost-graph-dev
* libboost-serialization-dev
* r-cran-strucchange
* graphviz
* doxygen
* colorgcc

[[HCL_cluster/hcl_node_install_configuration_log|new hcl node install & configuration log]]
[[HCL_cluster/heterogeneous.ucd.ie_install_log|new heterogeneous.ucd.ie install log]]

===APT===
To do unattended updates on cluster machines you need to specify some environment variables and switches to apt-get:
 export DEBIAN_FRONTEND=noninteractive
 apt-get -q -y upgrade
NOTE: on hcl01 and hcl02 any update to grub will force a prompt, despite the switches above. This happens because there are two disks on these machines and grub asks which it should install itself on.

== Access and Security ==
All access and security for the cluster is handled by the gateway machine (heterogeneous.ucd.ie).
This machine is not considered a compute node and should not be used as such. The only new incoming connections allowed are ssh; other incoming packets, such as http packets responding to requests from inside the cluster (established or related), are also allowed. Incoming ssh packets are only accepted if they originate from designated IP addresses, which must be registered UCD IPs. csserver.ucd.ie is allowed, as is hclgate.ucd.ie, on which all users have accounts. Thus, to gain access to the cluster you can ssh from csserver, hclgate or another allowed machine to heterogeneous. From there you can ssh to any of the nodes (hcl01-hcl16) on which you are running a pbs job. Access from outside the UCD network is only possible once you have gained entry to a server that allows outside connections (such as csserver.ucd.ie).
2fae8bf4e7d3cbf8a3cb65d2fdcbfbe9875880db
HCL cluster/heterogeneous.ucd.ie install log 0 48 536 533 2010-10-18T19:25:13Z Davepc 2 /* Torque - PBS */ wikitext text/x-wiki
* Basic installation of Debian Squeeze
See also [[HCL_cluster/hcl_node_install_configuration_log]]

==Networking==
===Interfaces===
* edit <code>/etc/network/interfaces</code>
Note that at some point eth1 should be configured by DHCP; it is on the UCD LAN and must be registered correctly (update the MAC address with services). <code>eth0</code> is the internal network.
<source lang="text">
auto lo eth0 eth1 eth2
iface lo inet loopback
iface eth0 inet static
    address 192.168.20.254
    netmask 255.255.255.0
iface eth1 inet static
    address 192.168.21.254
    netmask 255.255.255.0
iface eth2 inet dhcp
</source>
* Install non-free Linux firmware for the network interface (eth0). This will allow Gigabit operation on eth0 with the tg3 hardware (I think). Edit <code>/etc/apt/sources.list</code>, including the lines:
<source lang="text">
deb http://ftp.ie.debian.org/debian/ squeeze main contrib non-free
deb-src http://ftp.ie.debian.org/debian/ squeeze main contrib non-free
</source>
* Install firmware-linux:
<source lang="text">apt-get update && apt-get install firmware-linux</source>
You probably need to reboot now.

===DNS / BIND===
We will run our own DNS server for the cluster. Because dhclient is running on eth2, we need to stop it breaking resolv.conf.
Add to <code>/etc/dhcp/dhclient.conf</code>:
<source lang="text">
supersede domain-name "heterogeneous.ucd.ie ucd.ie";
prepend domain-name-servers 127.0.0.1;
</source>
After running dhclient, resolv.conf should read:
<source lang="text">
domain heterogeneous.ucd.ie
search heterogeneous.ucd.ie ucd.ie
nameserver 127.0.0.1
nameserver 137.43.116.19
nameserver 137.43.116.17
nameserver 137.43.105.22
</source>
Now install bind9 (<code>apt-get install bind9</code>). Edit <code>/etc/bind/named.conf.local</code> and set the domain zones for the cluster (forward and reverse). We have two subnets for which reverse lookups will have to be specified: 192.168.20 and 192.168.21.
<source lang="text">
//
// Do any local configuration here
//
// Consider adding the 1918 zones here, if they are not used in your
// organization
//include "/etc/bind/zones.rfc1918";

include "/etc/bind/rndc.key";
controls {
    inet 127.0.0.1 allow { localhost; } keys { "rndc-key"; };
};
zone "heterogeneous.ucd.ie" {
    type master;
    file "db.heterogeneous.ucd.ie";
};
zone "21.168.192.in-addr.arpa" {
    type master;
    file "db.192.168.21";
};
zone "20.168.192.in-addr.arpa" {
    type master;
    file "db.192.168.20";
};
</source>
Also edit the options file, <code>/etc/bind/named.conf.options</code>. Note the subnet we define in the allow sections, 192.168.20/23: it permits access from 192.168.20.* and 192.168.21.* addresses.
<source lang="text">
options {
    directory "/var/cache/bind";

    // If there is a firewall between you and nameservers you want
    // to talk to, you may need to fix the firewall to allow multiple
    // ports to talk. See http://www.kb.cert.org/vuls/id/800113

    // If your ISP provided one or more IP addresses for stable
    // nameservers, you probably want to use them as forwarders.
    // Uncomment the following block, and insert the addresses replacing
    // the all-0's placeholder.
    forwarders {
        137.43.116.19;
        137.43.116.17;
        137.43.105.22;
    };
    recursion yes;
    version "REFUSED";
    allow-recursion { 127.0.0.1; 192.168.20.0/23; };
    allow-query { 127.0.0.1; 192.168.20.0/23; };

    auth-nxdomain no;    # conform to RFC1035
    listen-on-v6 { any; };
};
</source>
Now work on the zone file specified above, <code>/var/cache/bind/db.heterogeneous.ucd.ie</code>, and the reverse maps <code>/var/cache/bind/db.192.168.20</code> & <code>/var/cache/bind/db.192.168.21</code>. Populate them with all nodes of the cluster.

===IP Tables===
* Set up <code>iptables</code>. We want to implement NAT between the internal network (<code>eth0</code>) and the external one (<code>eth1</code>). Add a script named <code>00iptables</code> to the <code>/etc/network/if-up.d</code> directory. All scripts in this directory are executed after the network interfaces are brought up, so this will persist:
<source lang="bash">
#!/bin/sh
PATH=/usr/sbin:/sbin:/bin:/usr/bin

IF_INT=eth0
IF_EXT=eth1

#
# delete all existing rules.
#
iptables -F
iptables -t nat -F
iptables -t mangle -F
iptables -X

# Always accept loopback traffic
iptables -A INPUT -i lo -j ACCEPT

# Allow established connections, and those not coming from the outside
iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
iptables -A INPUT -m state --state NEW ! -i $IF_EXT -j ACCEPT
iptables -A FORWARD -i $IF_EXT -o $IF_INT -m state --state ESTABLISHED,RELATED -j ACCEPT

# Allow outgoing connections from the LAN side.
iptables -A FORWARD -i $IF_INT -o $IF_EXT -j ACCEPT

# Masquerade.
iptables -t nat -A POSTROUTING -o $IF_EXT -j MASQUERADE

# Don't forward from the outside to the inside.
iptables -A FORWARD -i $IF_EXT -o $IF_EXT -j REJECT

# Enable routing.
echo 1 > /proc/sys/net/ipv4/ip_forward
</source>

==NFS==
Install:
 apt-get install nfs-kernel-server nfs-common portmap
Add to /etc/exports:
 /home 192.168.20.0/255.255.254.0(rw,sync,no_root_squash,no_subtree_check)
Restart the nfs server with:
 exportfs -a

==Clonezilla==
Firstly, Clonezilla is probably going to pollute a lot of your server configuration when it sets itself up. Be prepared to lose your iptables configuration, NFS (if any) and DHCP settings. Maybe more.
* follow the guide to installing Clonezilla [http://www.howtoforge.com/cloning-linux-systems-with-clonezilla-server-edition-clonezilla-se-p2 here]. Essentially:
** add the repository key: <source lang="text">wget -q http://drbl.sourceforge.net/GPG-KEY-DRBL -O- | apt-key add -</source>
** add the line to /etc/apt/sources.list: <source lang="text">deb http://drbl.sourceforge.net/drbl-core drbl stable</source>
** run: <source lang="text">apt-get update && apt-get install drbl && /opt/drbl/sbin/drbl4imp</source>
** accept the default options to drbl4imp.
* After Clonezilla has installed, edit <code>/etc/dhcp3/dhcpd.conf</code>, adding all entries for the test nodes <code>hcl07</code> and <code>hcl03</code>. Also ensure these nodes have been removed from the in-place heterogeneous.ucd.ie server so that they are only served by one machine.
<source lang="text">
default-lease-time 300;
max-lease-time 300;
option subnet-mask 255.255.255.0;
option domain-name-servers 137.43.116.19,137.43.116.17,137.43.105.22;
option domain-name "ucd.ie";
ddns-update-style none;
# brett had ad-hoc ...?
server-name drbl;
filename = "pxelinux.0";
subnet 192.168.21.0 netmask 255.255.255.0 {
    option subnet-mask 255.255.255.0;
    option routers 192.168.21.1;
    next-server 192.168.21.254;
    pool {
        # allow members of "DRBL-Client";
        range 192.168.21.200 192.168.21.212;
    }
    host hcl03 {
        option host-name "hcl03.ucd.ie";
        hardware ethernet 00:14:22:0A:22:6C;
        fixed-address 192.168.21.5;
    }
    host hcl03_eth1 {
        option host-name "hcl03_eth1.ucd.ie";
        hardware ethernet 00:14:22:0A:22:6D;
        fixed-address 192.168.21.105;
    }
    host hcl07 {
        option host-name "hcl07.ucd.ie";
        hardware ethernet 00:14:22:0A:20:E2;
        fixed-address 192.168.21.9;
    }
    host hcl07_eth1 {
        option host-name "hcl07_eth1.ucd.ie";
        hardware ethernet 00:14:22:0A:20:E3;
        fixed-address 192.168.21.109;
    }
    default-lease-time 21600;
    max-lease-time 43200;
}
</source>

==NTP Daemon==
Nodes in the cluster should use the Network Time Protocol to set their clocks; heterogeneous.ucd.ie should provide that service to them. Install ntpd with the following command:
 apt-get install ntp
Configure the daemon with the following line in <code>/etc/ntp.conf</code>:
 restrict 192.168.20.0 mask 255.255.254.0 nomodify notrap

==Install DHCP==
Install the DHCP server package with <code>apt-get install dhcp3-server</code>. When you install Clonezilla it will probably pollute your DHCP server setup, so make ... TODO

==Install NIS==
Copy users from <code>passwd</code>, <code>groups</code> and <code>shadow</code> from <code>/etc</code> on <code>hcl01</code>. Install nis.
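When copying the user databases over, it helps to skip system accounts so they are not duplicated on the server. A sketch of ours (not one of the cluster's scripts), assuming regular users start at UID 500, with a canned passwd fragment standing in for hcl01:/etc/passwd:

```shell
#!/bin/sh
# Sketch: keep only regular user entries (UID >= 500) from a passwd
# file. The fragment below is illustrative; in real use it would be
# read from hcl01:/etc/passwd.
passwd="root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/bin/sh
alice:x:1000:1000:Alice:/home/alice:/bin/bash
bob:x:501:501:Bob:/home/bob:/bin/bash"

# Field 3 of a passwd entry is the numeric UID.
users=$(printf '%s\n' "$passwd" | awk -F: '$3 >= 500 { print $1 }')
echo "$users"
```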
Set the domain to heterogeneous.ucd.ie. Edit <code>/etc/defaultdomain</code> so that it contains:
 heterogeneous.ucd.ie
Edit <code>/etc/default/nis</code> so that it contains:
 # Are we a NIS server and if so what kind (values: false, slave, master)
 NISSERVER=master
Edit <code>/etc/ypserv.securenets</code> so that it contains:
 # allow connects from local
 255.0.0.0       127.0.0.0
 # allow connections from heterogeneous subnets .20 and .21
 255.255.254.0   192.168.20.0
The NIS host is also a client of itself, so do the client set-up as follows. Edit <code>/etc/hosts</code> and ensure the NIS master is listed:
 192.168.20.254  heterogeneous.ucd.ie  heterogeneous
Edit <code>/etc/yp.conf</code> and ensure that it contains:
 domain heterogeneous.ucd.ie server localhost
Edit <code>/etc/passwd</code>, adding a line to the end that reads <code>+::::::</code>. Edit <code>/etc/group</code> with a line <code>+:::</code> at the end. The NIS Makefile will not pull user IDs and group IDs that are lower than a certain value; we must set this to 500 in <code>/var/yp/Makefile</code>:
 MINUID=500
 MINGID=500
Start the <code>ypbind</code> and <code>yppasswd</code> services. Then initialise the NIS database:
 /usr/lib/yp/ypinit -m
Accept the defaults at the prompts. Now start the other NIS services:
 service nis start

==Installing Ganglia Frontend==
Install the packages gmetad and ganglia-webfrontend. Configure the front end by appending the following to <code>/etc/apache2/apache2.conf</code>:
 Include /etc/ganglia-webfrontend/apache.conf
Configure gmetad by adding the following lines to <code>/etc/ganglia/gmetad.conf</code>:
 data_source "HCL Cluster" 192.168.20.1 192.168.20.16
 data_source "HCL Service" localhost:8648
This means that the gmetad collector connects to hcl01 and hcl16 on the .20 subnet to gather data for the frontend to use.
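A data_source line is a quoted cluster name followed by one or more host[:port] entries, which gmetad treats as alternative sources for the same cluster. A quick sketch of how such a line breaks down, using the first line above; the parsing script is ours, for illustration only:

```shell
#!/bin/sh
# Sketch: split a gmetad data_source line into its cluster name and
# host list. Format: data_source "name" host1[:port] host2[:port] ...
line='data_source "HCL Cluster" 192.168.20.1 192.168.20.16'

# Cluster name is the quoted string; hosts are everything after it.
name=$(echo "$line" | sed 's/^data_source "\([^"]*\)".*/\1/')
hosts=$(echo "$line" | sed 's/^data_source "[^"]*" *//')
echo "cluster: $name"
echo "hosts:   $hosts"
```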
After all packages are configured execute: <source lang="text"> service apache2 restart service gmetad restart </source> Pointing your browser to [http://heterogeneous.ucd.ie/ganglia/index.php here] should display the monitoring page for the HCL Cluster. <code>gmond</code> must also be installed and configured on the cluster nodes. =Hardware Monitoring & Backup= ==Disk Monitoring== Install smartmontools as per [http://www.howtoforge.com/checking-hard-disk-sanity-with-smartmontools-debian-ubuntu here]. Briefly: apt-get install smartmontools Edit <code>/etc/default/smartmontools</code> so that it contains: <source lang="text"> # List of devices you want to explicitly enable S.M.A.R.T. for # Not needed (and not recommended) if the device is monitored by smartd enable_smart="/dev/sda" # uncomment to start smartd on system startup start_smartd=yes # uncomment to pass additional options to smartd on startup smartd_opts="--interval=1800" </source> Open <code>/etc/smartd.conf</code> and edit the first line that begins with DEVICESCAN (all lines after the first instance of DEVICESCAN are ignored). Have it read something like: DEVICESCAN -d removable -n standby -m root -m robert_higgins@iol.ie -M exec /usr/share/smartmontools/smartd-runner Then start the service: <code>/etc/init.d/smartmontools start</code> Note: consider installing this on all nodes, as it would be useful to have prior notice of any failing disks. = Torque - PBS = Torque is now in the Debian repository, but we are using the newer version and compiling from source. However, apt-get installs libtorque2 with Open MPI, which leads to two versions of the library: one in /usr/lib and one in /usr/local/lib. As a quick fix we have changed the link in /usr/lib from libtorque.so.2 -> libtorque.so.2.0.0 to libtorque.so.2 -> /usr/local/lib/libtorque.so.2 '''Note: This may well break when Debian updates its packages''' Download Torque from [http://www.clusterresources.com/pages/products/torque-resource-manager.php here]. 
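The quick fix described above, written out as a command (run as root; paths are the ones given in the text):

```shell
# Repoint the Debian-packaged symlink at the locally built libtorque.
# ln -sf replaces the old libtorque.so.2 -> libtorque.so.2.0.0 link;
# as noted above, a future package upgrade may undo this.
ln -sf /usr/local/lib/libtorque.so.2 /usr/lib/libtorque.so.2
```

Run <code>ldconfig</code> afterwards so the linker cache reflects the change.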
Extract the archive and configure: ./configure --prefix=/usr/local Run <code>make && make install</code>. Torque will have installed files under <code>/usr/local</code> and <code>/var/spool/torque</code>. Edit the Torque config file <code>/var/spool/torque/torque.cfg</code>, adding the line: SERVERHOST=192.168.20.254 Add the compute nodes of the cluster to the file <code>/var/spool/torque/server_priv/nodes</code> <source lang="text"> hcl01 hcl02 hcl03 hcl04 hcl05 hcl06 hcl07 hcl08 hcl09 hcl10 hcl11 hcl12 hcl13 hcl14 hcl15 hcl16 </source> By default /usr/local/lib may not be in the list of directories searched for dynamically linked libraries. If so, add that path to the end of /etc/ld.so.conf.d/local.conf and run "ldconfig" to update the cache. Now return to the extracted source distribution, find the <code>contrib/init.d</code> folder and copy <code>debian.pbs_server</code> to <code>/etc/init.d/pbs_server</code>. Edit the copied file so that the header reads: <source lang="bash"> #!/bin/sh ### BEGIN INIT INFO # Provides: pbs_server # Required-Start: $local_fs $named # Should-Start: # Required-Stop: </source> This will ensure that important services are started before the PBS/Torque daemon. Then run update-rc.d pbs_server defaults pbs_server -t create qterm pbs_server ===Packages for nodes=== make packages scp torque-package-mom-linux-i686.sh hclxx: scp torque-package-clients-linux-i686.sh hclxx: ===Queues=== We will configure four queues, with varying priority and preemption policies. *The first is the <b>normal</b> queue. This is where most jobs, which should not be interrupted, will be placed. It has the highest priority. *The second queue is <b>lowpri</b>; it is for running jobs which may run for extended periods but are interruptible. 
Should a job <i>A</i> be placed on the normal queue while a lowpri job <i>B</i> is running, <i>B</i> will be sent a kill signal, so it may shut down cleanly, and then it will be requeued on the system, so it can resume running after <i>A</i> has finished. *The third queue is for running service jobs like the homedir backup. It is named <b>service</b>. This has lower priority than the above queues, and jobs running on this queue will be preemptable. *The final queue is the <b>volunteer</b> queue. This has the lowest priority and is for executing volunteer computing jobs during otherwise idle periods. These jobs will also be preemptable. ====Queue Setup==== Enter the queue manager program <code>qmgr</code> as the root user on heterogeneous. Send it the following commands: Allow all users to see all queued jobs: set server query_other_jobs=TRUE Create the default <b>normal</b> queue create queue normal set queue normal queue_type = Execution set queue normal Priority = 10000 set queue normal enabled = True set queue normal started = True Create the <b>lowpri</b> queue create queue lowpri set queue lowpri queue_type = Execution set queue lowpri Priority = 1000 set queue lowpri enabled = True set queue lowpri started = True Create the <b>service</b> queue create queue service set queue service queue_type = Execution set queue service Priority = 100 set queue service enabled = True set queue service started = True Create the <b>volunteer</b> queue create queue volunteer set queue volunteer queue_type = Execution set queue volunteer Priority = 0 set queue volunteer enabled = True set queue volunteer started = True Set some server settings set server default_queue=normal set server resources_default.nodes = 1 set server resources_default.walltime = 12:00:00 = Maui Scheduler = Download from [http://www.clusterresources.com/product/maui/ here], unpack and run ./configure --with-pbs --with-spooldir=/var/spool/maui Edit /var/spool/maui/maui.cfg <source lang="text"> SERVERHOST 
heterogeneous ADMIN1 root RMCFG[HETEROGENEOUS] TYPE=PBS AMCFG[bank] TYPE=NONE RMPOLLINTERVAL 00:00:01 QUEUETIMEWEIGHT 0 PREEMPTIONPOLICY REQUEUE DEFERTIME 0 DEFERCOUNT 86400 NODEALLOCATIONPOLICY FIRSTAVAILABLE QOSWEIGHT 1 QOSCFG[normal] QFLAGS=PREEMPTOR QOSCFG[lowpri] QFLAGS=PREEMPTEE:PREEMPTOR QTWEIGHT=0 QOSCFG[service] QFLAGS=PREEMPTEE:PREEMPTOR QTWEIGHT=0 QOSCFG[volunteer] QFLAGS=PREEMPTEE:PREEMPTOR QTWEIGHT=0 CREDWEIGHT 1 CLASSWEIGHT 1 CLASSCFG[normal] QDEF=normal PRIORITY=10000 CLASSCFG[lowpri] QDEF=lowpri PRIORITY=1000 CLASSCFG[service] QDEF=service PRIORITY=100 CLASSCFG[volunteer] QDEF=volunteer PRIORITY=1 </source> Edit /etc/profile <source lang="text"> if [ "`id -u`" -eq 0 ]; then PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/maui/bin:/usr/local/maui/sbin" else PATH="/usr/local/bin:/usr/bin:/bin:/usr/games:/usr/local/maui/bin" fi </source> bf0ede3643ad083f33080f933a566fcc06821ea5 537 536 2010-10-18T20:25:24Z Davepc 2 /* Torque - PBS */ wikitext text/x-wiki * Basic installation of Debian Squeeze See also [[HCL_cluster/hcl_node_install_configuration_log]] ==Networking== ===Interfaces=== * Edit <code>/etc/network/interfaces</code> Note that at some point eth1 should be configured by DHCP; it is on the UCD LAN and must be registered correctly (update the MAC address with services). <code>eth0</code> is the internal network. <source lang="text"> auto lo eth0 eth1 eth2 iface lo inet loopback iface eth0 inet static address 192.168.20.254 netmask 255.255.255.0 iface eth1 inet static address 192.168.21.254 netmask 255.255.255.0 iface eth2 inet dhcp </source> * Install the non-free Linux firmware for the network interface (eth0). This will allow Gigabit operation on eth0 with the tg3 hardware (I think). 
Edit <code>/etc/apt/sources.list</code> including the lines: <source lang="text">deb http://ftp.ie.debian.org/debian/ squeeze main contrib non-free deb-src http://ftp.ie.debian.org/debian/ squeeze main contrib non-free</source> * Install firmware-linux: <source lang="text">apt-get update && apt-get install firmware-linux</source> You probably need to reboot now. ===DNS / BIND=== We will run our own DNS server for the cluster. Because dhclient is running on eth2, we need to stop it breaking resolv.conf. Add to <code>/etc/dhcp/dhclient.conf</code> <source lang="text"> supersede domain-name "heterogeneous.ucd.ie ucd.ie"; prepend domain-name-servers 127.0.0.1; </source> After running dhclient, resolv.conf should read: <source lang="text"> domain heterogeneous.ucd.ie search heterogeneous.ucd.ie ucd.ie nameserver 127.0.0.1 nameserver 137.43.116.19 nameserver 137.43.116.17 nameserver 137.43.105.22 </source> Now install bind9 (<code>apt-get install bind9</code>). Edit <code>/etc/bind/named.conf.local</code> and set the domain zones for the cluster (forwards and reverse). We have two subdomains where reverse lookups will have to be specified 192.168.20 and 192.168.21 <source lang="text"> // // Do any local configuration here // // Consider adding the 1918 zones here, if they are not used in your // organization //include "/etc/bind/zones.rfc1918"; include "/etc/bind/rndc.key"; controls { inet 127.0.0.1 allow { localhost; } keys { "rndc-key"; }; }; zone "heterogeneous.ucd.ie" { type master; file "db.heterogeneous.ucd.ie"; }; zone "21.168.192.in-addr.arpa" { type master; file "db.192.168.21"; }; zone "20.168.192.in-addr.arpa" { type master; file "db.192.168.20"; }; </source> Also edit the options file: <code>/etc/bind/named.conf.options</code>, note the subnet we define in the allow sections, 192.168.20/23, it will permit access from 192.168.20.* and 192.168.21.* addresses. 
<source lang="text"> options { directory "/var/cache/bind"; // If there is a firewall between you and nameservers you want // to talk to, you may need to fix the firewall to allow multiple // ports to talk. See http://www.kb.cert.org/vuls/id/800113 // If your ISP provided one or more IP addresses for stable // nameservers, you probably want to use them as forwarders. // Uncomment the following block, and insert the addresses replacing // the all-0's placeholder. forwarders { 137.43.116.19; 137.43.116.17; 137.43.105.22; }; recursion yes; version "REFUSED"; allow-recursion { 127.0.0.1; 192.168.20.0/23; }; allow-query { 127.0.0.1; 192.168.20.0/23; }; auth-nxdomain no; # conform to RFC1035 listen-on-v6 { any; }; }; </source> Now create the zone files specified above: <code>/var/cache/bind/db.heterogeneous.ucd.ie</code> and the reverse maps <code>/var/cache/bind/db.192.168.21</code> & <code>/var/cache/bind/db.192.168.20</code>. Populate them with all nodes of the cluster. ===IP Tables=== * Set up <code>iptables</code>. We want to implement NAT between the internal network (<code>eth0</code>) and the external one (<code>eth1</code>). Add a script to the <code>/etc/network/if-up.d</code> directory, named <code>00iptables</code>. All scripts in this directory are executed after the network interfaces are brought up, so the rules will persist across reboots: <source lang="bash"> #!/bin/sh PATH=/usr/sbin:/sbin:/bin:/usr/bin IF_INT=eth0 IF_EXT=eth1 # # delete all existing rules. # iptables -F iptables -t nat -F iptables -t mangle -F iptables -X # Always accept loopback traffic iptables -A INPUT -i lo -j ACCEPT # Allow established connections, and those not coming from the outside iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT iptables -A INPUT -m state --state NEW ! -i $IF_EXT -j ACCEPT iptables -A FORWARD -i $IF_EXT -o $IF_INT -m state --state ESTABLISHED,RELATED -j ACCEPT # Allow outgoing connections from the LAN side. iptables -A FORWARD -i $IF_INT -o $IF_EXT -j ACCEPT # Masquerade. 
iptables -t nat -A POSTROUTING -o $IF_EXT -j MASQUERADE # Don't forward new connections from the outside to the inside. iptables -A FORWARD -i $IF_EXT -o $IF_INT -j REJECT # Enable routing. echo 1 > /proc/sys/net/ipv4/ip_forward </source> ==NFS== Install: apt-get install nfs-kernel-server nfs-common portmap Add to /etc/exports /home 192.168.20.0/255.255.254.0(rw,sync,no_root_squash,no_subtree_check) Re-export the shares with exportfs -a ==Clonezilla== Firstly, Clonezilla is probably going to pollute a lot of your server configuration when it sets itself up. Be prepared to lose your iptables configuration, NFS (if any) and DHCP settings. Maybe more. * Follow the guide to installing Clonezilla [http://www.howtoforge.com/cloning-linux-systems-with-clonezilla-server-edition-clonezilla-se-p2 here]. Essentially: ** Add the repository key: <source lang="text">wget -q http://drbl.sourceforge.net/GPG-KEY-DRBL -O- | apt-key add -</source> ** Add the line to /etc/apt/sources.list: <source lang="text">deb http://drbl.sourceforge.net/drbl-core drbl stable</source> ** Run: <source lang="text">apt-get update && apt-get install drbl && /opt/drbl/sbin/drbl4imp</source> ** Accept the default options offered by drbl4imp. * After Clonezilla has installed, edit <code>/etc/dhcp3/dhcpd.conf</code>, adding all entries for the test nodes <code>hcl07</code> and <code>hcl03</code>. Also ensure these nodes have been removed from the existing heterogeneous.ucd.ie DHCP server so that they are only served by one machine. <source lang="text"> default-lease-time 300; max-lease-time 300; option subnet-mask 255.255.255.0; option domain-name-servers 137.43.116.19,137.43.116.17,137.43.105.22; option domain-name "ucd.ie"; ddns-update-style none; # brett had ad-hoc ...? 
server-name drbl; filename = "pxelinux.0"; subnet 192.168.21.0 netmask 255.255.255.0 { option subnet-mask 255.255.255.0; option routers 192.168.21.1; next-server 192.168.21.254; pool { # allow members of "DRBL-Client"; range 192.168.21.200 192.168.21.212; } host hcl03 { option host-name "hcl03.ucd.ie"; hardware ethernet 00:14:22:0A:22:6C; fixed-address 192.168.21.5; } host hcl03_eth1 { option host-name "hcl03_eth1.ucd.ie"; hardware ethernet 00:14:22:0A:22:6D; fixed-address 192.168.21.105; } host hcl07 { option host-name "hcl07.ucd.ie"; hardware ethernet 00:14:22:0A:20:E2; fixed-address 192.168.21.9; } host hcl07_eth1 { option host-name "hcl07_eth1.ucd.ie"; hardware ethernet 00:14:22:0A:20:E3; fixed-address 192.168.21.109; } default-lease-time 21600; max-lease-time 43200; } </source> ==NTP Daemon== Nodes in the cluster should use the Network Time Protocol to set their clocks, heterogeneous.ucd.ie should provide them that service. Install ntpd with the following command: apt-get install ntp Configure the daemon with the following line in <code>/etc/ntp.conf</code>: restrict 192.168.20.0 mask 255.255.254.0 nomodify notrap ==Install DHCP== Install the DHCP server package with <code>apt-get install dhcp3-server</code>. When you install Clonezilla it will probably pollute your DHCP server setup, so make ... TODO ==Install NIS== Copy users from <code>passwd</code>, <code>groups</code> and <code>shadow</code> from <code>/etc</code> on <code>hcl01</code>. Install nis. 
Set domain to heterogeneous.ucd.ie Edit <code>/etc/defaultdomain</code> so that it contains: heterogeneous.ucd.ie Edit <code>/etc/default/nis</code> so that it contains: # Are we a NIS server and if so what kind (values: false, slave, master) NISSERVER=master Edit <code>/etc/ypserv.securenets</code> so that is contains: # allow connects from local 255.0.0.0 127.0.0.0 # allow connections from heterogeneous subnets .20 and .21 255.255.254.0 192.168.20.0 The NIS host is also a client of itself, so do the client set up as follows: Edit <code>/etc/hosts</code> end ensure the NIS Master is listed 192.168.20.254 heterogeneous.ucd.ie heterogeneous Edit <code>/etc/yp.conf</code> and ensure that it contains: domain heterogeneous.ucd.ie server localhost Edit <code>/etc/passwd</code> adding a line to the end that reads: <code>+::::::</code>. Edit <code>/etc/group</code> with a line <code>+:::</code> at the line. The NIS Makefile will not pull userid and groupids that are lower than a certain value, we must set this to 500 in <code>/var/yp/Makefile</code> MINUID=500 MINGID=500 Start the <code>ypbind</code> and <code>yppasswd</code> services. Then initialise the NIS database: /usr/lib/yp/ypinit -m Accept defaults at prompts. Now start other NIS services service nis start ==Installing Ganglia Frontend== Install the packages gmetad and ganglia-webfrontend. Configure the front end by appending to <code>/etc/apache2/apache2.conf</code>, the following: Include /etc/ganglia-webfrontend/apache.conf Configure gmetad by adding to the <code>/etc/ganglia/gmetad.conf</code>, the following line: data_source "HCL Cluster" 192.168.20.1 192.168.20.16 data_source "HCL Service" localhost:8648 This means that the gmetad collector connect to hcl01 and hcl16 on the .20 subnet to gather data for the frontend to use. 
After all packages are configured execute: <source lang="text"> service apache2 restart service gmetad restart </source> Pointing your browser to [http://heterogeneous.ucd.ie/ganglia/index.php here] should display the monitoring page for HCL Cluster. <code>gmond</code> must also be installed and configured on the cluster nodes. =Hardware Monitoring & Backup= ==Disk Monitoring== Install smartmontools as per [http://www.howtoforge.com/checking-hard-disk-sanity-with-smartmontools-debian-ubuntu here]. Briefly: apt-get install smartmontools Edit <code>/etc/defaults/smartmontools</code> so that it contains: <source lang="text"> # List of devices you want to explicitly enable S.M.A.R.T. for # Not needed (and not recommended) if the device is monitored by smartd enable_smart="/dev/sda" # uncomment to start smartd on system startup start_smartd=yes # uncomment to pass additional options to smartd on startup smartd_opts="--interval=1800" </source> Open <code>/etc/smartd.conf</code> and edit the first line that begins with DEVICESCAN (all lines after the first instance of DEVICESCAN are ignored). Have it read something like: DEVICESCAN -d removable -n standby -m root -m robert_higgins@iol.ie -M exec /usr/share/smartmontools/smartd-runner Then start the service <code>/etc/init.d/smartmontools start</code> Note, consider installing this on all nodes, as it would be interesting to have prior notice of any failing disks. = Torque - PBS = Torque is now in the Debian repository, but we are using the newer version and compiling from sources. But apt-get installs the libtorque2 with open-MPI, this leads to two versions of the library; in /usr/lib and /usr/local/lib. If you manually set LD_LIBRARY_PATH just make sure that /usr/local/lib is first. Download Torque from [http://www.clusterresources.com/pages/products/torque-resource-manager.php here]. Extract archive and configure: ./configure --prefix=/usr/local Run <code>make && make install</code>. 
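After <code>make install</code> and <code>ldconfig</code>, a quick way to see which copies of the library the dynamic linker's cache knows about (a sketch; the output naturally depends on what is installed):

```shell
# List cached shared libraries and pick out libtorque entries; if both
# /usr/lib and /usr/local/lib copies appear, check their order. The
# fallback message covers a machine where no copy is cached yet.
matches=$(ldconfig -p | grep libtorque || true)
echo "${matches:-no libtorque in ld.so cache}"
```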
Torque will have installed files under <code>/usr/local</code> and <code>/var/spool/torque</code>. Edit the Torque config file <code>/var/spool/torque/torque.cfg</code> adding the line: SERVERHOST=192.168.20.254 Add the compute nodes of the cluster to the file <code>/var/spool/torque/server_priv/nodes</code> <source lang="text"> hcl01 hcl02 hcl03 hcl04 hcl05 hcl06 hcl07 hcl08 hcl09 hcl10 hcl11 hcl12 hcl13 hcl14 hcl15 hcl16 </source> By default /usr/local/lib may not be in the list of directories to search for dynamic-linked libraries. If so, add that path to the end of /etc/ld.so.conf.d/local.conf, and run "ldconfig" to update. Now return to the extracted source distribution, find the <code>contrib/init.d</code> folder and copy the <code>debian.pbs_server</code> to <code>/etc/init.d/pbs_server</code>. Edit the inplace file so that the header reads: <source lang="bash"> #!/bin/sh ### BEGIN INIT INFO # Provides: pbs_server # Required-Start: $local_fs $named # Should-Start: # Required-Stop: </source> This will ensure that important services are started before the PBS/Torque daemon. then run update-rc.d pbs_server defaults pbs_server -t create qterm pbs_server ===Packages for nodes=== make packages scp torque-package-mom-linux-i686.sh hclxx: scp torque-package-clients-linux-i686.sh hclxx: ===Queues=== We will configure 4 queues, with varying priority and preemption policies. *The first is the <b>normal</b> queue. This will be where most jobs, which should not be interrupted, will be placed. It has the highest priority. *The second queue is <b>lowpri</b>, it is for running jobs which may run for extended periods but are interruptible. Should a job <i>A</i> be placed on the normal queue while a lowpri job <i>B</i> is running, <i>B</i> will be sent a kill signal, so it may shut down cleanly, and then it will be requeued on the system, so it can resume running after <i>A</i>b has finished. *The third queue is for running of service jobs like the homedir backup. 
It is named <b>service</b>. This will have lower priority than the above queues and jobs running on this queue will be preemptable. *The final queue is the <b>volunteer</b> queue. This has the lowest priority and is for executing volunteer computing jobs during otherwise idle periods. These jobs will also be preemptable. ====Queue Setup==== Enter the queue manager program <code>qmgr</code> as the root user on heterogeneous. Send it the following commands: Allow all users to see all queued jobs: set server query_other_jobs=TRUE' Create the default <b>normal</b> queue create queue normal set queue normal queue_type = Execution set queue normal Priority = 10000 set queue normal enabled = True set queue normal started = True Create <b>lowpri</b> queue create queue lowpri set queue lowpri queue_type = Execution set queue lowpri Priority = 1000 set queue lowpri enabled = True set queue lowpri started = True Create <b>service</b> queue create queue service set queue service queue_type = Execution set queue service Priority = 100 set queue service enabled = True set queue service started = True Create <b>volunteer</b> queue create queue volunteer set queue volunteer queue_type = Execution set queue volunteer Priority = 0 set queue volunteer enabled = True set queue volunteer started = True Set some server settings set server default_queue=normal set server resources_default.nodes = 1 set server resources_default.walltime = 12:00:00 = Maui Scheduler = Download from [http://www.clusterresources.com/product/maui/ here], unpack and run ./configure --with-pbs --with-spooldir=/var/spool/maui Edit /var/spool/maui/maui.cfg <source lang="text"> SERVERHOST heterogeneous ADMIN1 root RMCFG[HETEROGENEOUS] TYPE=PBS AMCFG[bank] TYPE=NONE RMPOLLINTERVAL 00:00:01 QUEUETIMEWEIGHT 0 PREEMPTIONPOLICY REQUEUE DEFERTIME 0 DEFERCOUNT 86400 NODEALLOCATIONPOLICY FIRSTAVAILABLE QOSWEIGHT 1 QOSCFG[normal] QFLAGS=PREEMPTOR QOSCFG[lowpri] QFLAGS=PREEMPTEE:PREEMPTOR QTWEIGHT=0 QOSCFG[service] 
QFLAGS=PREEMPTEE:PREEMPTOR QTWEIGHT=0 QOSCFG[volunteer] QFLAGS=PREEMPTEE:PREEMPTOR QTWEIGHT=0 CREDWEIGHT 1 CLASSWEIGHT 1 CLASSCFG[normal] QDEF=normal PRIORITY=10000 CLASSCFG[lowpri] QDEF=lowpri PRIORITY=1000 CLASSCFG[service] QDEF=service PRIORITY=100 CLASSCFG[volunteer] QDEF=volunteer PRIORITY=1 </source> edit /etc/profile <source lang="text"> if [ "`id -u`" -eq 0 ]; then PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/maui/bin:/usr/local/maui/sbin" else PATH="/usr/local/bin:/usr/bin:/bin:/usr/games:/usr/local/maui/bin" fi </source> 7f3831ddd5af48a0c11540ddac4b9a6733e937da 539 537 2010-10-26T15:37:57Z Davepc 2 /* Torque - PBS */ wikitext text/x-wiki * Basic installation of Debian Squeeze See also [[HCL_cluster/hcl_node_install_configuration_log]] ==Networking== ===Interfaces=== * edit <code>/etc/networks/interfaces</code> Note that at some point eth1 should be configured by DHCP, it is on the UCD LAN and must be registered correctly (update MAC address with services). <code>eth0</code> is the internal network. <source lang="text"> auto lo eth0 eth1 eth2 iface lo inet loopback iface eth0 inet static address 192.168.20.254 netmask 255.255.255.0 iface eth1 inet static address 192.168.21.254 netmask 255.255.255.0 iface eth2 inet dhcp </source> * Install non-free linux firmware for network interface (eth0). This will allow Gigabit operation on eth0 with the tg3 hardware (I think). Edit <code>/etc/apt/sources.list</code> including the lines: <source lang="text">deb http://ftp.ie.debian.org/debian/ squeeze main contrib non-free deb-src http://ftp.ie.debian.org/debian/ squeeze main contrib non-free</source> * Install firmware-linux: <source lang="text">apt-get update && apt-get install firmware-linux</source> You probably need to reboot now. ===DNS / BIND=== We will run our own DNS server for the cluster. Because dhclient is running on eth2, we need to stop it breaking resolv.conf. 
Add to <code>/etc/dhcp/dhclient.conf</code> <source lang="text"> supersede domain-name "heterogeneous.ucd.ie ucd.ie"; prepend domain-name-servers 127.0.0.1; </source> After running dhclient, resolv.conf should read: <source lang="text"> domain heterogeneous.ucd.ie search heterogeneous.ucd.ie ucd.ie nameserver 127.0.0.1 nameserver 137.43.116.19 nameserver 137.43.116.17 nameserver 137.43.105.22 </source> Now install bind9 (<code>apt-get install bind9</code>). Edit <code>/etc/bind/named.conf.local</code> and set the domain zones for the cluster (forwards and reverse). We have two subdomains where reverse lookups will have to be specified 192.168.20 and 192.168.21 <source lang="text"> // // Do any local configuration here // // Consider adding the 1918 zones here, if they are not used in your // organization //include "/etc/bind/zones.rfc1918"; include "/etc/bind/rndc.key"; controls { inet 127.0.0.1 allow { localhost; } keys { "rndc-key"; }; }; zone "heterogeneous.ucd.ie" { type master; file "db.heterogeneous.ucd.ie"; }; zone "21.168.192.in-addr.arpa" { type master; file "db.192.168.21"; }; zone "20.168.192.in-addr.arpa" { type master; file "db.192.168.20"; }; </source> Also edit the options file: <code>/etc/bind/named.conf.options</code>, note the subnet we define in the allow sections, 192.168.20/23, it will permit access from 192.168.20.* and 192.168.21.* addresses. <source lang="text"> options { directory "/var/cache/bind"; // If there is a firewall between you and nameservers you want // to talk to, you may need to fix the firewall to allow multiple // ports to talk. See http://www.kb.cert.org/vuls/id/800113 // If your ISP provided one or more IP addresses for stable // nameservers, you probably want to use them as forwarders. // Uncomment the following block, and insert the addresses replacing // the all-0's placeholder. 
forwarders { 137.43.116.19; 137.43.116.17; 137.43.105.22; }; recursion yes; version "REFUSED"; allow-recursion { 127.0.0.1; 192.168.20.0/23; }; allow-query { 127.0.0.1; 192.168.20.0/23; }; auth-nxdomain no; # conform to RFC1035 listen-on-v6 { any; }; }; </source> Now work on the zone files specified <code>/var/cache/bind/db.heterogneneous.ucd.ie</code> and the reverse maps <code>/var/cache/bind/db.192.168.21</code> & <code>/var/cache/bind/db.192.168.21</code> Populate them with all nodes of the cluster. ===IP Tables=== * Set up <code>iptables</code>. We want to implement NAT between the internal network (<code>eth0</code>) and external one (<code>eth1</code>). Add a script to <code>/etc/network/if-up.d</code> directory, named <code>00iptables</code>. All scripts in this directory will be executed after network interfaces are brought up, so this will persist: <source lang="bash"> #!/bin/sh PATH=/usr/sbin:/sbin:/bin:/usr/bin IF_INT=eth0 IF_EXT=eth1 # # delete all existing rules. # iptables -F iptables -t nat -F iptables -t mangle -F iptables -X # Always accept loopback traffic iptables -A INPUT -i lo -j ACCEPT # Allow established connections, and those not coming from the outside iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT iptables -A INPUT -m state --state NEW ! -i $IF_EXT -j ACCEPT iptables -A FORWARD -i $IF_EXT -o $IF_INT -m state --state ESTABLISHED,RELATED -j ACCEPT # Allow outgoing connections from the LAN side. iptables -A FORWARD -i $IF_INT -o $IF_EXT -j ACCEPT # Masquerade. iptables -t nat -A POSTROUTING -o $IF_EXT -j MASQUERADE # Don't forward from the outside to the inside. iptables -A FORWARD -i $IF_EXT -o $IF_EXT -j REJECT # Enable routing. 
echo 1 > /proc/sys/net/ipv4/ip_forward </source> ==NFS== Install: apt-get install nfs-kernel-server nfs-common portmap Add to /etc/exports /home 192.168.20.0/255.255.254.0(rw,sync,no_root_squash,no_subtree_check) Restart nfs server with exportfs -a ==Clonezilla== Firstly, Clonezilla is probably going to pollute a lot of your server configuration when it sets itself up. Be prepared to loose your IPtables configuration, NFS (if any) and DHCP settings. Maybe more. * follow the guide to installing Clonezilla [http://www.howtoforge.com/cloning-linux-systems-with-clonezilla-server-edition-clonezilla-se-p2 here]. Essentially: ** add repository key <source lang="text">wget -q http://drbl.sourceforge.net/GPG-KEY-DRBL -O- | apt-key add -</source> ** the line add to /etc/apt/sources.list: <source lang="text">deb http://drbl.sourceforge.net/drbl-core drbl stable</source> ** run: <source lang="text">apt-get update && apt-get install drbl && /opt/drbl/sbin/drbl4imp</source> ** accept default options to drbl4imp. * After Clonezilla has installed edit <code>/etc/dhcpd3/dhcpd.conf</code>, adding all entries for test nodes <code>hcl07</code> and <code>hcl03</code>. Also ensure these nodes have been removed from the inplace heterogeneous.ucd.ie server so that they are only served by one machine. <source lang="text"> default-lease-time 300; max-lease-time 300; option subnet-mask 255.255.255.0; option domain-name-servers 137.43.116.19,137.43.116.17,137.43.105.22; option domain-name "ucd.ie"; ddns-update-style none; # brett had ad-hoc ...? 
server-name drbl; filename = "pxelinux.0"; subnet 192.168.21.0 netmask 255.255.255.0 { option subnet-mask 255.255.255.0; option routers 192.168.21.1; next-server 192.168.21.254; pool { # allow members of "DRBL-Client"; range 192.168.21.200 192.168.21.212; } host hcl03 { option host-name "hcl03.ucd.ie"; hardware ethernet 00:14:22:0A:22:6C; fixed-address 192.168.21.5; } host hcl03_eth1 { option host-name "hcl03_eth1.ucd.ie"; hardware ethernet 00:14:22:0A:22:6D; fixed-address 192.168.21.105; } host hcl07 { option host-name "hcl07.ucd.ie"; hardware ethernet 00:14:22:0A:20:E2; fixed-address 192.168.21.9; } host hcl07_eth1 { option host-name "hcl07_eth1.ucd.ie"; hardware ethernet 00:14:22:0A:20:E3; fixed-address 192.168.21.109; } default-lease-time 21600; max-lease-time 43200; } </source> ==NTP Daemon== Nodes in the cluster should use the Network Time Protocol to set their clocks, heterogeneous.ucd.ie should provide them that service. Install ntpd with the following command: apt-get install ntp Configure the daemon with the following line in <code>/etc/ntp.conf</code>: restrict 192.168.20.0 mask 255.255.254.0 nomodify notrap ==Install DHCP== Install the DHCP server package with <code>apt-get install dhcp3-server</code>. When you install Clonezilla it will probably pollute your DHCP server setup, so make ... TODO ==Install NIS== Copy users from <code>passwd</code>, <code>groups</code> and <code>shadow</code> from <code>/etc</code> on <code>hcl01</code>. Install nis. 
Set domain to heterogeneous.ucd.ie Edit <code>/etc/defaultdomain</code> so that it contains: heterogeneous.ucd.ie Edit <code>/etc/default/nis</code> so that it contains: # Are we a NIS server and if so what kind (values: false, slave, master) NISSERVER=master Edit <code>/etc/ypserv.securenets</code> so that is contains: # allow connects from local 255.0.0.0 127.0.0.0 # allow connections from heterogeneous subnets .20 and .21 255.255.254.0 192.168.20.0 The NIS host is also a client of itself, so do the client set up as follows: Edit <code>/etc/hosts</code> end ensure the NIS Master is listed 192.168.20.254 heterogeneous.ucd.ie heterogeneous Edit <code>/etc/yp.conf</code> and ensure that it contains: domain heterogeneous.ucd.ie server localhost Edit <code>/etc/passwd</code> adding a line to the end that reads: <code>+::::::</code>. Edit <code>/etc/group</code> with a line <code>+:::</code> at the line. The NIS Makefile will not pull userid and groupids that are lower than a certain value, we must set this to 500 in <code>/var/yp/Makefile</code> MINUID=500 MINGID=500 Start the <code>ypbind</code> and <code>yppasswd</code> services. Then initialise the NIS database: /usr/lib/yp/ypinit -m Accept defaults at prompts. Now start other NIS services service nis start ==Installing Ganglia Frontend== Install the packages gmetad and ganglia-webfrontend. Configure the front end by appending to <code>/etc/apache2/apache2.conf</code>, the following: Include /etc/ganglia-webfrontend/apache.conf Configure gmetad by adding to the <code>/etc/ganglia/gmetad.conf</code>, the following line: data_source "HCL Cluster" 192.168.20.1 192.168.20.16 data_source "HCL Service" localhost:8648 This means that the gmetad collector connect to hcl01 and hcl16 on the .20 subnet to gather data for the frontend to use. 
After all packages are configured, execute:
<source lang="text">
service apache2 restart
service gmetad restart
</source>
Pointing your browser to [http://heterogeneous.ucd.ie/ganglia/index.php here] should display the monitoring page for the HCL Cluster. <code>gmond</code> must also be installed and configured on the cluster nodes.

=Hardware Monitoring & Backup=
==Disk Monitoring==
Install smartmontools as per [http://www.howtoforge.com/checking-hard-disk-sanity-with-smartmontools-debian-ubuntu here]. Briefly:
 apt-get install smartmontools
Edit <code>/etc/default/smartmontools</code> so that it contains:
<source lang="text">
# List of devices you want to explicitly enable S.M.A.R.T. for
# Not needed (and not recommended) if the device is monitored by smartd
enable_smart="/dev/sda"

# uncomment to start smartd on system startup
start_smartd=yes

# uncomment to pass additional options to smartd on startup
smartd_opts="--interval=1800"
</source>
Open <code>/etc/smartd.conf</code> and edit the first line that begins with DEVICESCAN (all lines after the first instance of DEVICESCAN are ignored). Have it read something like:
 DEVICESCAN -d removable -n standby -m root -m robert_higgins@iol.ie -M exec /usr/share/smartmontools/smartd-runner
Then start the service: <code>/etc/init.d/smartmontools start</code>

Note: consider installing this on all nodes, as it would be useful to have prior notice of any failing disks.

= Torque - PBS =
Torque is now in the Debian repository, but we are using the newer version and compiling from sources. However, apt-get installs libtorque2 with Open MPI, which leads to two versions of the library: one in /usr/lib and one in /usr/local/lib. If you manually set LD_LIBRARY_PATH, make sure that /usr/local/lib comes first.

Download Torque from [http://www.clusterresources.com/pages/products/torque-resource-manager.php here]. Extract the archive and configure:
 ./configure --prefix=/usr/local
Run <code>make && make install</code>.
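Because two copies of libtorque may end up installed, a quick check that /usr/local/lib precedes /usr/lib in a search path can catch mistakes. A sketch (the helper name is ours):

```shell
#!/bin/sh
# Illustrative check: succeed only if /usr/local/lib comes before /usr/lib
# in a colon-separated search path, so the locally built libtorque is the
# one the dynamic linker finds.
local_lib_first() {
  old_ifs="$IFS"; IFS=:
  for dir in $1; do
    case "$dir" in
      /usr/local/lib) IFS="$old_ifs"; return 0 ;;
      /usr/lib)       IFS="$old_ifs"; return 1 ;;
    esac
  done
  IFS="$old_ifs"
  return 1
}

# e.g.: local_lib_first "$LD_LIBRARY_PATH" || echo "warning: /usr/lib wins"
```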
Torque will have installed files under <code>/usr/local</code> and <code>/var/spool/torque</code>. Edit the Torque config file <code>/var/spool/torque/torque.cfg</code>, adding the line:
 SERVERHOST=192.168.20.254
Add the compute nodes of the cluster to the file <code>/var/spool/torque/server_priv/nodes</code>:
<source lang="text">
hcl01
hcl02
hcl03
hcl04
hcl05
hcl06
hcl07
hcl08
hcl09
hcl10
hcl11
hcl12
hcl13
hcl14
hcl15
hcl16
</source>
By default /usr/local/lib may not be in the list of directories searched for dynamically linked libraries. If so, add that path to the end of /etc/ld.so.conf.d/local.conf and run <code>ldconfig</code> to update.

Now return to the extracted source distribution, find the <code>contrib/init.d</code> folder and copy <code>debian.pbs_server</code> to <code>/etc/init.d/pbs_server</code>. Edit the file in place so that the header reads:
<source lang="bash">
#!/bin/sh
### BEGIN INIT INFO
# Provides:       pbs_server
# Required-Start: $local_fs $named
# Should-Start:
# Required-Stop:
</source>
This will ensure that important services are started before the PBS/Torque daemon. Then run:
 update-rc.d pbs_server defaults
 pbs_server -t create
 qterm
 pbs_server

===Packages for nodes===
 make packages
 scp torque-package-mom-linux-i686.sh hclxx:
 scp torque-package-clients-linux-i686.sh hclxx:

===Queues===
We will configure four queues, with varying priority and preemption policies.
*The first is the <b>normal</b> queue. This is where most jobs, which should not be interrupted, will be placed. It has the highest priority.
*The second queue is <b>lowpri</b>; it is for jobs which may run for extended periods but are interruptible. Should a job <i>A</i> be placed on the normal queue while a lowpri job <i>B</i> is running, <i>B</i> will be sent a kill signal, so it may shut down cleanly, and then be requeued on the system, so it can resume running after <i>A</i> has finished.
*The third queue is for running service jobs like the homedir backup.
It is named <b>service</b>. This will have lower priority than the above queues, and jobs running on this queue will be preemptable.
*The final queue is the <b>volunteer</b> queue. This has the lowest priority and is for executing volunteer computing jobs during otherwise idle periods. These jobs will also be preemptable.

====Queue Setup====
Enter the queue manager program <code>qmgr</code> as the root user on heterogeneous. Send it the following commands.

Allow all users to see all queued jobs:
 set server query_other_jobs = True
Create the default <b>normal</b> queue:
 create queue normal
 set queue normal queue_type = Execution
 set queue normal Priority = 10000
 set queue normal enabled = True
 set queue normal started = True
Create the <b>lowpri</b> queue:
 create queue lowpri
 set queue lowpri queue_type = Execution
 set queue lowpri Priority = 1000
 set queue lowpri enabled = True
 set queue lowpri started = True
Create the <b>service</b> queue:
 create queue service
 set queue service queue_type = Execution
 set queue service Priority = 100
 set queue service enabled = True
 set queue service started = True
Create the <b>volunteer</b> queue:
 create queue volunteer
 set queue volunteer queue_type = Execution
 set queue volunteer Priority = 0
 set queue volunteer enabled = True
 set queue volunteer started = True
Set some server settings:
 set server default_queue = normal
 set server resources_default.nodes = 1
 set server resources_default.walltime = 12:00:00

== Misc ==
Everyone can see every job: in qmgr
 set server query_other_jobs = True

= Maui Scheduler =
Download from [http://www.clusterresources.com/product/maui/ here], unpack and run:
 ./configure --with-pbs --with-spooldir=/var/spool/maui
Edit /var/spool/maui/maui.cfg:
<source lang="text">
SERVERHOST           heterogeneous
ADMIN1               root
RMCFG[HETEROGENEOUS] TYPE=PBS
AMCFG[bank]          TYPE=NONE
RMPOLLINTERVAL       00:00:01
QUEUETIMEWEIGHT      0
PREEMPTIONPOLICY     REQUEUE
DEFERTIME            0
DEFERCOUNT           86400
NODEALLOCATIONPOLICY FIRSTAVAILABLE
QOSWEIGHT            1
QOSCFG[normal]
QFLAGS=PREEMPTOR
QOSCFG[lowpri]    QFLAGS=PREEMPTEE:PREEMPTOR QTWEIGHT=0
QOSCFG[service]   QFLAGS=PREEMPTEE:PREEMPTOR QTWEIGHT=0
QOSCFG[volunteer] QFLAGS=PREEMPTEE:PREEMPTOR QTWEIGHT=0
CREDWEIGHT        1
CLASSWEIGHT       1
CLASSCFG[normal]    QDEF=normal    PRIORITY=10000
CLASSCFG[lowpri]    QDEF=lowpri    PRIORITY=1000
CLASSCFG[service]   QDEF=service   PRIORITY=100
CLASSCFG[volunteer] QDEF=volunteer PRIORITY=1
</source>
Edit /etc/profile so that the Maui directories are on the PATH:
<source lang="text">
if [ "`id -u`" -eq 0 ]; then
  PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/maui/bin:/usr/local/maui/sbin"
else
  PATH="/usr/local/bin:/usr/bin:/bin:/usr/games:/usr/local/maui/bin"
fi
</source>
44a8cca1aef01e338de4cfff0b61523d76357dd9 HCL cluster/hcl node install configuration log 0 49 538 527 2010-10-21T11:52:31Z Davepc 2 /* NFS */ wikitext text/x-wiki
HCL nodes will be installed from a clone of a root node, <code>hcl07</code>. The general installation of the root node is documented here. There are a number of complications as a result of the cloning process; solutions to these complications are also explained. See also [[HCL_cluster/heterogeneous.ucd.ie_install_log]]

=General Installation=
Partition the disk with swap at the end, size 1GB, equal to the maximum of the installed memory on cluster nodes. The root file system occupies the remainder of the disk, in EXT4 format. Install a long list of packages.

==Networking==
Configure the network interfaces as follows:
<source lang="text">
# This file describes the network interfaces available on your system
# and how to activate them. For more information, see interfaces(5).

# The loopback network interface
auto lo eth0 eth1
iface lo inet loopback

# The primary network interface
allow-hotplug eth0
iface eth0 inet dhcp

allow-hotplug eth1
iface eth1 inet dhcp
</source>

===Routing Tables===
Add an executable script named <code>00routes</code> to the directory <code>/etc/network/if-up.d</code>.
This script will be called after the interfaces listed as <code>auto</code> are brought up on boot (or on a networking service restart). Note: the DHCP process for eth1 is backgrounded so that the startup of other services can continue. Our routing script adds routes on this interface even though it is not yet fully up; the script outputs some errors, but the routing entries remain nonetheless. The script should read as follows:
<source lang="bash">
#!/bin/sh
# Static Routes

# route ganglia broadcast
route add -host 239.2.11.72 dev eth0

# all traffic to heterogeneous gate goes through eth0
route add -host 192.168.20.254 dev eth0

# all subnet traffic goes through a specific interface
route add -net 192.168.20.0 netmask 255.255.255.0 dev eth0
route add -net 192.168.21.0 netmask 255.255.255.0 dev eth1
</source>
The naming of the script is important: we want our routes in place before the other scripts in the <code>/etc/network/if-up.d</code> directory are executed, and the order in which they are executed is alphabetical.

===Hosts===
Change the hosts file so that it does not list the node's hostname, as this would confuse nodes that are cloned from this image.
<source lang="text">
127.0.0.1 localhost

# The following lines are desirable for IPv6 capable hosts
::1     localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
</source>

==Ganglia==
Install the ganglia-monitor package. Configure the ganglia monitor by editing <code>/etc/ganglia/gmond.conf</code> so that it contains:
<source lang="text">
cluster {
  name = "HCL Cluster"
  owner = "University College Dublin"
  latlong = "unspecified"
  url = "http://hcl.ucd.ie/"
}
</source>
And ...
<source lang="text">
/* Feel free to specify as many udp_send_channels as you like.
   Gmond used to only support having a single channel */
udp_send_channel {
  mcast_join = 239.2.11.72
  port = 8649
  ttl = 1
}

/* You can specify as many udp_recv_channels as you like as well.
*/
udp_recv_channel {
  mcast_join = 239.2.11.72
  port = 8649
  bind = 239.2.11.72
}
</source>
After all packages are configured, execute:
<source lang="text">
service ganglia-monitor restart
</source>

==NIS Client==
Install the nis package. Set <code>/etc/defaultdomain</code> to contain <code>heterogeneous.ucd.ie</code>.

Make sure the NIS server has an entry in <code>/etc/hosts</code>; this is because DNS may not be active when the NIS client is starting, and we want to ensure that it connects to the server successfully.
 192.168.21.254 heterogeneous.ucd.ie heterogeneous
Make sure the file <code>/etc/nsswitch.conf</code> contains:
 passwd: compat
 group:  compat
 shadow: compat
Append to the file <code>/etc/passwd</code> the line <code>+::::::</code>. Append to the file <code>/etc/group</code> the line <code>+:::</code>. Append to the file <code>/etc/shadow</code> the line <code>+::::::::</code>.

Edit <code>/etc/yp.conf</code> (this wasn't needed before; it is now, with the RAID server):
 ypserver 192.168.20.254
Start the nis service:
 service nis start
Check that NIS is operating correctly by running the following command:
 ypcat passwd

==NFS==
 apt-get install nfs-common portmap
Add the line to <code>/etc/fstab</code>:
 192.168.20.254:/home /home nfs soft,retrans=6 0 0
Set in <code>/etc/default/nfs-common</code> (this wasn't needed before; it is now, with the RAID server):
 NEED_IDMAPD=yes
Then:
 service nfs-common restart
 mount /home

==Torque PBS==
First build the PBS node packages on the head node, as [[New_heterogeneous.ucd.ie_install_log#Packages_for_nodes|explained here]]. Then:
 ./torque-package-mom-linux-i686.sh --install
 ./torque-package-clients-linux-i686.sh --install
 update-rc.d pbs_mom defaults
 service pbs_mom start

==NTP==
Install the NTP software:
 apt-get install ntp
Edit the configuration and make <code>server heterogeneous.ucd.ie</code> the sole server entry; comment out any other servers. Restart the NTP service.

=Complications=
==Hostnames==
Debian does not pull the hostname from the DHCP server.
Without intervention, cloned nodes will keep the hostname stored on the image of the root node. A bug report describing the setting of a hostname via DHCP can be found [https://bugs.launchpad.net/ubuntu/+source/dhcp3/+bug/90388 here]. The solution we will use is to add the file <code>/etc/dhcp3/dhclient-exit-hooks.d/hostname</code> with the contents:
<source lang="bash">
# This hook is sourced by dhclient-script, which runs under /bin/sh,
# so use the POSIX [ ] test rather than bash's [[ ]]
if [ -n "$new_host_name" ]; then
    echo "$new_host_name" > /etc/hostname
    /bin/hostname "$new_host_name"
fi
</source>
The effect of this is to set the hostname of the machine after an interface is configured using dhclient (the DHCP client). Note that the hostname of the machine will be set by the last interface configured via DHCP; in the current configuration that is <code>eth0</code>. If an interface is reconfigured using dhclient, the hostname will be reset to the name belonging to that interface.

Further, the current hostnames for the second interface on nodes, <code>eth1</code>, are '''invalid'''. They follow the format hcl??_eth1.ucd.ie; however, the '_' character is not permitted in hostnames, and attempting to set such a hostname fails.

==udev and Network Interfaces==
The udev system attempts to keep network interface names consistent regardless of changing hardware. This may be useful for laptops with wireless cards that are plugged in and out, but it causes problems when trying to install our root node image across all machines in the cluster. A description of the problem can be read [http://www.ducea.com/2008/09/01/remove-debian-udev-persistent-net-rules/ here]. The solution is to remove the udev rules for persistent network interfaces, and disable the generator script for these rules.
On the root cloning node do the following:
#remove the file <code>/etc/udev/rules.d/70-persistent-net.rules</code>
#add to the top of the file <code>/lib/udev/rules.d/75-persistent-net-generator.rules</code> the following lines:
<source lang="text">
# skip generation of persistent network interfaces
ACTION=="*", GOTO="persistent_net_generator_end"
</source>

==Sysstat==
[http://pagesperso-orange.fr/sebastien.godard/ Sysstat] is a useful suite of tools for measuring the performance of different system components. Unfortunately it adds some unhelpful cron entries for collecting a historic set of system performance data. Although these cron entries point to disabled scripts, we will disable them nonetheless. Edit the file <code>/etc/cron.d/sysstat</code> and comment out all lines.
<source lang="text">
# The first element of the path is a directory where the debian-sa1
# script is located
#PATH=/usr/lib/sysstat:/usr/sbin:/usr/sbin:/usr/bin:/sbin:/bin

# Activity reports every 10 minutes everyday
#5-55/10 * * * * root command -v debian-sa1 > /dev/null && debian-sa1 1 1

# Additional run at 23:59 to rotate the statistics file
#59 23 * * * root command -v debian-sa1 > /dev/null && debian-sa1 60 2
</source>
fe6ed3c9400eae1d34eae7673f929a6313d45f31 Main Page 0 1 540 488 2010-10-28T14:18:16Z Root 1 /* Clusters */ wikitext text/x-wiki
This site is set up for sharing ideas, findings and experience in heterogeneous computing. Please log in and create new or edit existing pages. To learn how to format wiki-pages, read [[Help:Editing|here]].
== HCL software for heterogeneous computing ==
* Extensions for [[MPI]]: [http://hcl.ucd.ie/project/mpC mpC] [http://hcl.ucd.ie/project/HeteroMPI HeteroMPI] [http://hcl.ucd.ie/project/libELC libELC]
* Extensions for [[GridRPC]]: [http://hcl.ucd.ie/project/SmartGridSolve SmartGridSolve] [http://hcl.ucd.ie/project/NI-Connect NI-Connect]
* Computation benchmarking, modeling, dynamic load balancing: [http://hcl.ucd.ie/project/fupermod FuPerMod] [http://hcl.ucd.ie/project/pmm PMM]
* Communication benchmarking, modeling, optimization: [http://hcl.ucd.ie/project/cpm CPM] [http://hcl.ucd.ie/project/mpiblib MPIBlib]

== Heterogeneous mathematical software ==
* [http://hcl.ucd.ie/project/HeteroScaLAPACK HeteroScaLAPACK]
* [http://hcl.ucd.ie/project/Hydropad Hydropad]

== Operating systems ==
* [[Linux]]
* [[Windows]]

== Development tools ==
* [[C/C++]], [[Python]]
* [[Autotools]]
* [[GDB]], [[OProfile]], [[Valgrind]]
* [[Doxygen]]
* [[ChangeLog]], [[Subversion]]
* [[Eclipse]]

== [[Libraries]] ==
* [[GNU C Library]]
* [[MPI]]
* [[STL]], [[Boost]]
* [[GSL]]
* [[BLAS/LAPACK]]

== Data processing ==
* [[gnuplot]]
* [[Graphviz]]
* [[Octave]], [[R]]
* [[G3DViewer]]

== Paper & Presentation Tools ==
* [[Dia]]
* [[LaTeX]]
* [[JabRef]]

== Hardware ==
* [[HCL cluster]]
* [[Other UCD Resources]]
* [http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180 UTK multicores + GPU]
* [[Grid5000]]

[[SSH|How to connect to cluster via SSH]]

[[hwloc|How to find information about the hardware]]

== Mathematics ==
* [http://en.wikipedia.org/wiki/Confidence_interval Confidence interval (Statistics)], [http://en.wikipedia.org/wiki/Student's_t-distribution Student's t-distribution] (implemented in [[GSL]])
* [http://en.wikipedia.org/wiki/Linear_regression Linear regression] (implemented in [[GSL]])
* [http://en.wikipedia.org/wiki/Binomial_tree#Binomial_tree Binomial tree] (use [[Graphviz]] to visualize trees)
* [http://en.wikipedia.org/wiki/Spline_interpolation Spline interpolation], [http://en.wikipedia.org/wiki/B-spline Spline approximation] (implemented in [[GSL]])

== Tips & Tricks ==
* [http://www.gnu.org/s/libc/manual/html_node/Date-and-Time.html#Date-and-Time Timing in C]
deeb95d9c25db1a7ae6a0bb6f9c85df06e1cc806 Hwloc 0 73 541 2010-10-28T14:19:10Z Root 1 New page: http://www.open-mpi.org/projects/hwloc/ The Portable Hardware Locality (hwloc) software package provides a portable abstraction (across OS, versions, architectures, ...) of the hierarchic... wikitext text/x-wiki
http://www.open-mpi.org/projects/hwloc/

The Portable Hardware Locality (hwloc) software package provides a portable abstraction (across OS, versions, architectures, ...) of the hierarchical topology of modern architectures, including NUMA memory nodes, sockets, shared caches, cores and simultaneous multithreading. It also gathers various system attributes such as cache and memory information. It primarily aims at helping applications gather information about modern computing hardware so as to exploit it accordingly and efficiently.
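hwloc itself should be used for real topology discovery. Purely as an illustration of the kind of information involved, here is a rough sketch that counts logical CPUs from Linux sysfs; the sysfs layout assumed here is Linux-specific, and the helper name is ours, not hwloc's:

```shell
#!/bin/sh
# Rough illustration only: count logical CPUs from Linux sysfs.
# hwloc's own tools (e.g. lstopo) report far more detail, portably.
count_cpus() {
  sysroot="${1:-/sys/devices/system/cpu}"
  ls -d "$sysroot"/cpu[0-9]* 2>/dev/null | wc -l
}
```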
362f45f9c6148094057bb21c824e55f2b2f72373 Autotools 0 21 542 234 2010-10-28T14:32:43Z Root 1 /* Conditional building */ wikitext text/x-wiki
http://en.wikipedia.org/wiki/Autoconf

http://sourceware.org/autobook/autobook/autobook_toc.html

== Manuals ==
* http://www.gnu.org/software/autoconf/manual/index.html
* http://www.gnu.org/software/automake/manual/index.html
* http://www.gnu.org/software/libtool/manual/index.html

== Tutorials ==
* http://www.lrde.epita.fr/~adl/autotools.html (very nice set of slides)

== Libraries ==
* includes (for the include directory): <code>include_HEADERS = ...</code>
* library: static <code>lib_LIBRARIES = library.a</code> or dynamic <code>lib_LTLIBRARIES = library.la</code>
* sources (internal C data structures and C++ template classes): <code>library_X_SOURCES = library.h ...</code>, where <code>X</code> = <code>a</code> or <code>la</code>
** <code>library.h</code> is like a [http://en.wikipedia.org/wiki/Precompiled_header precompiled header] (it contains common headers and symbols) and is to be included in most of the source files of the library (there is no need for real precompiled headers in small C projects)
For example, http://hcl.ucd.ie/repos/CPM/trunk/MPIBlib/benchmarks/Makefile.am

== Configured headers ==
Configured headers (created from *.h.in) must not be included in the package, that is, not listed in <code>include_HEADERS</code> or <code>*_SOURCES</code>. Instead use:
* <code>nodist_include_HEADERS = *.h</code> for the configured headers as includes
* <code>nodist_*_SOURCES = *.h</code> or <code>BUILT_SOURCES = *.h</code> for the configured headers as sources
For example, http://hcl.ucd.ie/repos/CPM/trunk/MPIBlib/collectives/Makefile.am

== Extra files ==
To add extra files into the package, use <code>EXTRA_DIST = *</code>.
For example, http://hcl.ucd.ie/repos/CPM/trunk/MPIBlib/tools/Makefile.am

== Conditional building ==
* http://www.gnu.org/software/hello/manual/automake/Conditionals.html
* In the source code, use macros
<source lang="C">
#ifdef SYMBOL
...
#endif
</source>

== MPI support ==
* Define MPI compilers/linkers in configure.ac
<source lang="text">
AC_PROG_CC([mpicc])
AC_PROG_CXX([mpic++ mpicxx])
</source>

== C/C++ support ==
* To check C++ features, switch to the C++ language in configure.ac
<source lang="text">
AC_LANG_PUSH(C++)
AC_CHECK_HEADER([header.hpp])
AC_LANG_POP(C++)
</source>
* To link C code with C++ libraries, add a non-existent C++ file dummy.cpp to the sources in Makefile.am
<source lang="text">
bin_PROGRAMS = program
program_SOURCES = program.c
nodist_EXTRA_program_SOURCES = dummy.cpp
program_LDADD = cpplibrary.a
</source>

== Script for downloading & installing recent versions (in March 2010) of m4, libtool, autoconf, automake ==
 #!/bin/bash
 parent_dir=$PWD
 export PATH=$HOME/$ARCH/bin:$PATH
 wget http://ftp.gnu.org/gnu/libtool/libtool-2.2.6b.tar.gz
 tar xzf libtool-2.2.6b.tar.gz
 cd libtool-2.2.6b
 ./configure --prefix=$HOME/$ARCH
 make install
 cd $parent_dir
 wget http://ftp.gnu.org/gnu/m4/m4-1.4.14.tar.gz
 tar xfz m4-1.4.14.tar.gz
 cd m4-1.4.14
 ./configure --prefix=$HOME/$ARCH
 make install
 cd $parent_dir
 wget http://ftp.gnu.org/gnu/autoconf/autoconf-2.65.tar.bz2
 tar xjf autoconf-2.65.tar.bz2
 cd autoconf-2.65
 ./configure --prefix=$HOME/$ARCH
 make install
 cd $parent_dir
 wget http://ftp.gnu.org/gnu/automake/automake-1.10.3.tar.bz2
 tar xjf automake-1.10.3.tar.bz2
 cd automake-1.10.3
 ./configure --prefix=$HOME/$ARCH
 make install
 cd $parent_dir
12f62b7ab5cb33f1c0d41b0826337e28b646611c SSH 0 36 543 487 2010-11-10T12:41:31Z Zhongziming 5 /* X11 forwarding */ wikitext text/x-wiki
== Passwordless SSH ==
To set up passwordless SSH, there are three main things to do:
* generate a pair of public/private keys on your local computer
* copy the public key from the source computer to the target computer's authorized_keys file
* check the permissions.
You can repeat that transitively for "A->B->C". You can use the initial pair of keys everywhere.
See here for details: http://www.stearns.org/doc/ssh-techniques.current.html == Automatically saying "yes" == This expect script automates typing "yes" when asked by SSH if a host should be added to known_hosts #!/usr/bin/expect -f set arg1 [lindex $argv 0] set timeout 2 spawn ssh $arg1 expect "yes/no" { send "yes\n" } send "exit\n" send "\r" You can include it in a bash script to iterate over all nodes doing this: for i in `uniq hostfile` ; do ./say-yes.exp $i done == Making a cascade of SSH connections easy == Here is a very convenient way to set up the access to any machine directly instead of doing a cascade of SSH calls. If you can not directly access e.g. the machine "heterogeneous", but you can log into "csserver" and then to "heterogeneous", you can put this into your .ssh/config file : Host csserver User kdichev Hostname csserver.ucd.ie Host heterogeneous User kiril Hostname heterogeneous.ucd.ie ProxyCommand ssh -qax csserver nc %h %p Since the installation of a new PBS system, you can not directly log into a hclXX node. You can do ssh heterogeneous instead and use "qsub" [[HCL_cluster#Access_and_Security]] == X11 forwarding == <code lang="bash"> ssh -X hostname </code> or add the following line to your .ssh/ssh_config file ForwardX11 yes de37c11b2a1fad01017cef86e122729c5d2395a1 559 543 2011-01-27T16:45:39Z Kiril 3 /* Automatically saying "yes" */ wikitext text/x-wiki == Passwordless SSH == To set up passwordless SSH, there are three main things to do: * generate a pair of public/private keys on your local computer * copy the public key from the source computer to the target computer's authorized_keys file * check the permissions. You can repeat that transitively for "A->B->C". You can use the initial pair of keys everywhere. 
See here for details: http://www.stearns.org/doc/ssh-techniques.current.html == Automatically saying "yes" == This expect script automates typing "yes" when asked by SSH if a host should be added to known_hosts #!/usr/bin/expect -f set arg1 [lindex $argv 0] set timeout 2 spawn ssh $arg1 expect "yes/no" { send "yes\n" } send "exit\n" send "\r" You can include it in a bash script to iterate over all nodes doing this: for i in `uniq hostfile` ; do ./say-yes.exp $i done == Better than automatically saying "yes" == Remark: It turns out there is a more ellegant way to do this task: using a tool called ''ssh-add''. == Making a cascade of SSH connections easy == Here is a very convenient way to set up the access to any machine directly instead of doing a cascade of SSH calls. If you can not directly access e.g. the machine "heterogeneous", but you can log into "csserver" and then to "heterogeneous", you can put this into your .ssh/config file : Host csserver User kdichev Hostname csserver.ucd.ie Host heterogeneous User kiril Hostname heterogeneous.ucd.ie ProxyCommand ssh -qax csserver nc %h %p Since the installation of a new PBS system, you can not directly log into a hclXX node. You can do ssh heterogeneous instead and use "qsub" [[HCL_cluster#Access_and_Security]] == X11 forwarding == <code lang="bash"> ssh -X hostname </code> or add the following line to your .ssh/ssh_config file ForwardX11 yes 140e047c6bc29d8060bcadb733642dcf245f0f68 560 559 2011-01-27T16:46:38Z Kiril 3 /* Better than automatically saying "yes" */ wikitext text/x-wiki == Passwordless SSH == To set up passwordless SSH, there are three main things to do: * generate a pair of public/private keys on your local computer * copy the public key from the source computer to the target computer's authorized_keys file * check the permissions. You can repeat that transitively for "A->B->C". You can use the initial pair of keys everywhere. 
See here for details: http://www.stearns.org/doc/ssh-techniques.current.html == Automatically saying "yes" == This expect script automates typing "yes" when asked by SSH if a host should be added to known_hosts #!/usr/bin/expect -f set arg1 [lindex $argv 0] set timeout 2 spawn ssh $arg1 expect "yes/no" { send "yes\n" } send "exit\n" send "\r" You can include it in a bash script to iterate over all nodes doing this: for i in `uniq hostfile` ; do ./say-yes.exp $i done == Better than automatically saying "yes" == Remark: It turns out there is a more elegant way to do this task: using a tool called ''ssh-keyscan''. == Making a cascade of SSH connections easy == Here is a very convenient way to set up the access to any machine directly instead of doing a cascade of SSH calls. If you cannot directly access e.g. the machine "heterogeneous", but you can log into "csserver" and then to "heterogeneous", you can put this into your .ssh/config file: Host csserver User kdichev Hostname csserver.ucd.ie Host heterogeneous User kiril Hostname heterogeneous.ucd.ie ProxyCommand ssh -qax csserver nc %h %p Since the installation of a new PBS system, you cannot directly log into an hclXX node. You can do ssh heterogeneous instead and use "qsub" [[HCL_cluster#Access_and_Security]] == X11 forwarding == <code lang="bash"> ssh -X hostname </code> or add the following line to your .ssh/ssh_config file ForwardX11 yes 632acc02e2c1d2a9491fcd234e3fb17e013f635c C/C++ 0 14 545 391 2010-11-18T13:47:19Z Root 1 /* General */ wikitext text/x-wiki == Coding == * C++ programming style is preferable. For example, in variable declarations, pointers and references should have their reference symbol next to the type rather than to the name. Variables should be initialized where they are declared, and should be declared where they are used. 
For more details, see [http://google-styleguide.googlecode.com/svn/trunk/cppguide.xml Google C++ Style Guide] * [http://en.wikipedia.org/wiki/Indent_style#Variant:_1TBS One-true-brace ident style] * [http://en.wikipedia.org/wiki/Pragma_once Coding header files] * Learn from examples and use coding approaches from third-party software == Commenting == * Place [[Doxygen]] comments in header files (before declarations of namespaces/classes/structs/typedefs/macros) and main source files (for documenting tools and tests) * Use double forward slash for short comments in the code == C++ == * [http://developers.sun.com/solaris/articles/mixing.html Mixing C/C++] * Provide main API in C * Use plain C unless you need flexible data structures or [[STL]]/[[Boost]] functionality * [http://en.wikipedia.org/wiki/Template_metaprogramming Template C++] is preferrable from the point of view of runtime performance * Mind the life cycle of objects: [http://en.wikipedia.org/wiki/Default_constructor Default constructor] [http://en.wikipedia.org/wiki/Copy_constructor Copy constructor], [http://en.wikipedia.org/wiki/Destructor_(computer_science) Destructor] * [http://www.gnu.org/software/hello/manual/automake/Libtool-Convenience-Libraries.html Force C++ linking] == General == * Don't use non-standard functions, like [http://en.wikipedia.org/wiki/Itoa itoa] * [http://www.gnu.org/software/libc/manual/html_node/Program-Arguments.html Handling program arguments] (avoid <code>argp</code> since it is not supported on many platforms) * [http://en.wikipedia.org/wiki/Dynamic_loading Dynamic loading of shared libraries] * Avoid [http://en.wikipedia.org/wiki/Variable-length_array variable-length arrays]. First, GCC allocates them on the stack. Second, the status of this feature in GCC is BROKEN. 
Third, never do this: <source lang="C"> int size; MPI_Comm_size(MPI_COMM_WORLD, &size); char names[size][MPI_MAX_PROCESSOR_NAME]; </source> 81f498b70aaa6392ee4796c733e21af7f190a0ad 546 545 2010-11-19T12:26:08Z Root 1 /* General */ wikitext text/x-wiki == Coding == * C++ programming style is preferrable. For example, in variable declarations, pointers and references should have their reference symbol next to the type rather than to the name. Variables should be initialized where they are declared, and should be declared where they are used. For more details, see [http://google-styleguide.googlecode.com/svn/trunk/cppguide.xml Google C++ Style Guide] * [http://en.wikipedia.org/wiki/Indent_style#Variant:_1TBS One-true-brace ident style] * [http://en.wikipedia.org/wiki/Pragma_once Coding header files] * Learn from examples and use coding approaches from third-party software == Commenting == * Place [[Doxygen]] comments in header files (before declarations of namespaces/classes/structs/typedefs/macros) and main source files (for documenting tools and tests) * Use double forward slash for short comments in the code == C++ == * [http://developers.sun.com/solaris/articles/mixing.html Mixing C/C++] * Provide main API in C * Use plain C unless you need flexible data structures or [[STL]]/[[Boost]] functionality * [http://en.wikipedia.org/wiki/Template_metaprogramming Template C++] is preferrable from the point of view of runtime performance * Mind the life cycle of objects: [http://en.wikipedia.org/wiki/Default_constructor Default constructor] [http://en.wikipedia.org/wiki/Copy_constructor Copy constructor], [http://en.wikipedia.org/wiki/Destructor_(computer_science) Destructor] * [http://www.gnu.org/software/hello/manual/automake/Libtool-Convenience-Libraries.html Force C++ linking] == General == * Don't use non-standard functions, like [http://en.wikipedia.org/wiki/Itoa itoa] * [http://www.gnu.org/software/libc/manual/html_node/Program-Arguments.html Handling program 
arguments] (avoid <code>argp</code> since it is not supported on many platforms) * [http://en.wikipedia.org/wiki/Dynamic_loading Dynamic loading of shared libraries] * Avoid [http://en.wikipedia.org/wiki/Variable-length_array variable-length arrays]. First, GCC allocates them on the stack. Second, the status of this feature in GCC is BROKEN. Therefore, never do this: <source lang="C"> int size; MPI_Comm_size(MPI_COMM_WORLD, &size); char names[size][MPI_MAX_PROCESSOR_NAME]; </source> 0ba1381c6319d27f2631b5b8c232b251d2bc3cd4 551 546 2011-01-14T16:57:54Z Root 1 /* General */ wikitext text/x-wiki == Coding == * C++ programming style is preferrable. For example, in variable declarations, pointers and references should have their reference symbol next to the type rather than to the name. Variables should be initialized where they are declared, and should be declared where they are used. For more details, see [http://google-styleguide.googlecode.com/svn/trunk/cppguide.xml Google C++ Style Guide] * [http://en.wikipedia.org/wiki/Indent_style#Variant:_1TBS One-true-brace ident style] * [http://en.wikipedia.org/wiki/Pragma_once Coding header files] * Learn from examples and use coding approaches from third-party software == Commenting == * Place [[Doxygen]] comments in header files (before declarations of namespaces/classes/structs/typedefs/macros) and main source files (for documenting tools and tests) * Use double forward slash for short comments in the code == C++ == * [http://developers.sun.com/solaris/articles/mixing.html Mixing C/C++] * Provide main API in C * Use plain C unless you need flexible data structures or [[STL]]/[[Boost]] functionality * [http://en.wikipedia.org/wiki/Template_metaprogramming Template C++] is preferrable from the point of view of runtime performance * Mind the life cycle of objects: [http://en.wikipedia.org/wiki/Default_constructor Default constructor] [http://en.wikipedia.org/wiki/Copy_constructor Copy constructor], 
[http://en.wikipedia.org/wiki/Destructor_(computer_science) Destructor] * [http://www.gnu.org/software/hello/manual/automake/Libtool-Convenience-Libraries.html Force C++ linking] == General == * Don't use non-standard functions, like [http://en.wikipedia.org/wiki/Itoa itoa] * [http://www.gnu.org/software/libc/manual/html_node/Program-Arguments.html Handling program arguments] (avoid <code>argp</code> since it is not supported on many platforms) * [http://en.wikipedia.org/wiki/Dynamic_loading Dynamic loading of shared libraries] * Avoid [http://en.wikipedia.org/wiki/Variable-length_array variable-length arrays]. First, GCC allocates them on the stack. Second, the status of this feature in GCC is BROKEN. Therefore, never do this: <syntaxhighlight lang="C"> int size; MPI_Comm_size(MPI_COMM_WORLD, &size); char names[size][MPI_MAX_PROCESSOR_NAME]; </syntaxhighlight> be8abbf2d20129c10a2a3a35b4df2d8c934233ab 556 551 2011-01-14T17:13:20Z Root 1 wikitext text/x-wiki == Coding == * C++ programming style is preferrable. For example, in variable declarations, pointers and references should have their reference symbol next to the type rather than to the name. Variables should be initialized where they are declared, and should be declared where they are used. 
For more details, see [http://google-styleguide.googlecode.com/svn/trunk/cppguide.xml Google C++ Style Guide] * [http://en.wikipedia.org/wiki/Indent_style#Variant:_1TBS One-true-brace ident style] * [http://en.wikipedia.org/wiki/Pragma_once Coding header files] * Learn from examples and use coding approaches from third-party software == Commenting == * Place [[Doxygen]] comments in header files (before declarations of namespaces/classes/structs/typedefs/macros) and main source files (for documenting tools and tests) * Use double forward slash for short comments in the code == C++ == * [http://developers.sun.com/solaris/articles/mixing.html Mixing C/C++] * Provide main API in C * Use plain C unless you need flexible data structures or [[STL]]/[[Boost]] functionality * [http://en.wikipedia.org/wiki/Template_metaprogramming Template C++] is preferrable from the point of view of runtime performance * Mind the life cycle of objects: [http://en.wikipedia.org/wiki/Default_constructor Default constructor] [http://en.wikipedia.org/wiki/Copy_constructor Copy constructor], [http://en.wikipedia.org/wiki/Destructor_(computer_science) Destructor] * [http://www.gnu.org/software/hello/manual/automake/Libtool-Convenience-Libraries.html Force C++ linking] == General == * Don't use non-standard functions, like [http://en.wikipedia.org/wiki/Itoa itoa] * [http://www.gnu.org/software/libc/manual/html_node/Program-Arguments.html Handling program arguments] (avoid <code>argp</code> since it is not supported on many platforms) * [http://en.wikipedia.org/wiki/Dynamic_loading Dynamic loading of shared libraries] * Avoid [http://en.wikipedia.org/wiki/Variable-length_array variable-length arrays]. First, GCC allocates them on the stack. Second, the status of this feature in GCC is BROKEN. 
Therefore, never do this: <source lang="C"> int size; MPI_Comm_size(MPI_COMM_WORLD, &size); char names[size][MPI_MAX_PROCESSOR_NAME]; </source> 0ba1381c6319d27f2631b5b8c232b251d2bc3cd4 561 556 2011-02-01T15:57:34Z Root 1 /* General */ wikitext text/x-wiki == Coding == * C++ programming style is preferrable. For example, in variable declarations, pointers and references should have their reference symbol next to the type rather than to the name. Variables should be initialized where they are declared, and should be declared where they are used. For more details, see [http://google-styleguide.googlecode.com/svn/trunk/cppguide.xml Google C++ Style Guide] * [http://en.wikipedia.org/wiki/Indent_style#Variant:_1TBS One-true-brace ident style] * [http://en.wikipedia.org/wiki/Pragma_once Coding header files] * Learn from examples and use coding approaches from third-party software == Commenting == * Place [[Doxygen]] comments in header files (before declarations of namespaces/classes/structs/typedefs/macros) and main source files (for documenting tools and tests) * Use double forward slash for short comments in the code == C++ == * [http://developers.sun.com/solaris/articles/mixing.html Mixing C/C++] * Provide main API in C * Use plain C unless you need flexible data structures or [[STL]]/[[Boost]] functionality * [http://en.wikipedia.org/wiki/Template_metaprogramming Template C++] is preferrable from the point of view of runtime performance * Mind the life cycle of objects: [http://en.wikipedia.org/wiki/Default_constructor Default constructor] [http://en.wikipedia.org/wiki/Copy_constructor Copy constructor], [http://en.wikipedia.org/wiki/Destructor_(computer_science) Destructor] * [http://www.gnu.org/software/hello/manual/automake/Libtool-Convenience-Libraries.html Force C++ linking] == General == * Don't use non-standard functions, like [http://en.wikipedia.org/wiki/Itoa itoa] * [http://www.gnu.org/software/libc/manual/html_node/Program-Arguments.html Handling program 
arguments] (avoid <code>argp</code> since it is not supported on many platforms) * [http://en.wikipedia.org/wiki/Dynamic_loading Dynamic loading of shared libraries] * Avoid [http://en.wikipedia.org/wiki/Variable-length_array variable-length arrays]. First, GCC allocates them on the stack. Second, the status of this feature in GCC is BROKEN. Therefore, never do this: <source lang="C"> int size; MPI_Comm_size(MPI_COMM_WORLD, &size); char names[size][MPI_MAX_PROCESSOR_NAME]; </source> * Implement delays in the execution of the program with the help of [http://linux.die.net/man/2/nanosleep nanosleep]. Compared to sleep and usleep, nanosleep has the advantages that it does not affect any signals, it is standardized by POSIX, it provides higher timing resolution, and it makes it easier to continue a sleep that has been interrupted by a signal. 03c7740696d4d617508df3fa59cc31a40b40300d 568 561 2011-03-18T12:41:47Z Root 1 wikitext text/x-wiki == Coding == * C++ programming style is preferable. For example, in variable declarations, pointers and references should have their reference symbol next to the type rather than to the name. Variables should be initialized where they are declared, and should be declared where they are used. 
For more details, see [http://google-styleguide.googlecode.com/svn/trunk/cppguide.xml Google C++ Style Guide] * [http://en.wikipedia.org/wiki/Indent_style#Variant:_1TBS One-true-brace indent style] * [http://en.wikipedia.org/wiki/Pragma_once Coding header files] * Learn from examples and use coding approaches from third-party software == Commenting == * Place [[Doxygen]] comments in header files (before declarations of namespaces/classes/structs/typedefs/macros) and main source files (for documenting tools and tests) * Use double forward slash for short comments in the code == C++ == * [http://developers.sun.com/solaris/articles/mixing.html Mixing C/C++] * Provide main API in C * Use plain C unless you need flexible data structures or [[STL]]/[[Boost]] functionality * [http://en.wikipedia.org/wiki/Template_metaprogramming Template C++] is preferable from the point of view of runtime performance * Mind the life cycle of objects: [http://en.wikipedia.org/wiki/Default_constructor Default constructor], [http://en.wikipedia.org/wiki/Copy_constructor Copy constructor], [http://en.wikipedia.org/wiki/Destructor_(computer_science) Destructor] * [http://www.gnu.org/software/hello/manual/automake/Libtool-Convenience-Libraries.html Force C++ linking] == Tips & Tricks == * [http://www.gnu.org/s/libc/manual/html_node/Date-and-Time.html#Date-and-Time Timing in C] * Don't use non-standard functions, like [http://en.wikipedia.org/wiki/Itoa itoa] * [http://www.gnu.org/software/libc/manual/html_node/Program-Arguments.html Handling program arguments] (avoid <code>argp</code> since it is not supported on many platforms) * [http://en.wikipedia.org/wiki/Dynamic_loading Dynamic loading of shared libraries] * Avoid [http://en.wikipedia.org/wiki/Variable-length_array variable-length arrays]. First, GCC allocates them on the stack. Second, the status of this feature in GCC is BROKEN. 
Therefore, never do this: <source lang="C"> int size; MPI_Comm_size(MPI_COMM_WORLD, &size); char names[size][MPI_MAX_PROCESSOR_NAME]; </source> * Implement delays in the execution of the program with the help of [http://linux.die.net/man/2/nanosleep nanosleep]. Compared to sleep and usleep, nanosleep has the advantages that it does not affect any signals, it is standardized by POSIX, it provides higher timing resolution, and it makes it easier to continue a sleep that has been interrupted by a signal. 8bc7c208e13093f56a813f448a853498e870fd83 OpenMPI 0 47 548 444 2011-01-12T10:03:07Z Zhongziming 5 wikitext text/x-wiki http://www.open-mpi.org/faq/ == MCA parameter files == If you want to permanently use some MCA parameter settings, you can create a file $HOME/.openmpi/mca-params.conf, e.g.: cat $HOME/.openmpi/mca-params.conf btl_tcp_if_exclude = lo,eth1 == Running applications on Multiprocessors/Multicores == Processes can be bound to specific sockets and cores on nodes by choosing the right options of mpirun. * [http://www.open-mpi.org/doc/v1.4/man1/mpirun.1.php#sect9 Process binding] * [http://www.open-mpi.org/doc/v1.4/man1/mpirun.1.php#sect10 Rankfiles] == Debugging applications on Multiprocessors/Multicores == * [http://www.open-mpi.org/faq/?category=debugging#serial-debuggers Serial debugger (e.g. gdb)] ** 1. Attach to individual MPI processes after they are running. For example, launch your MPI application as normal with mpirun. Then log in to the node(s) where your application is running and use the --pid option to gdb to attach to your application. 
An inelegant-but-functional technique commonly used with this method is to insert the following code in your application where you want to attach: { int i = 0; char hostname[256]; gethostname(hostname, sizeof(hostname)); printf("PID %d on %s ready for attach\n", getpid(), hostname); fflush(stdout); while (0 == i) sleep(5); } This code will output a line to stdout outputting the name of the host where the process is running and the PID to attach to. It will then spin on the sleep() function forever waiting for you to attach with a debugger. Using sleep() as the inside of the loop means that the processor won't be pegged at 100% while waiting for you to attach. Once you attach with a debugger, go up the function stack until you are in this block of code (you'll likely attach during the sleep()) then set the variable i to a nonzero value. With GDB, the syntax is: (gdb) set var i = 7 Then set a breakpoint after your block of code and continue execution until the breakpoint is hit. Now you have control of your live MPI application and use the full functionality of the debugger. You can even add conditionals to only allow this "pause" in the application for specific MPI processes (e.g., MPI_COMM_WORLD rank 0, or whatever process is misbehaving). ** 2. Use mpirun to launch xterms (or equivalent) with serial debuggers. shell$ mpirun -np 4 xterm -e gdb my_mpi_application d7bfbca01a9be539b0171c49559533616f3a845c 549 548 2011-01-12T10:08:01Z Zhongziming 5 /* Debugging applications on Multiprocessors/Multicores */ wikitext text/x-wiki http://www.open-mpi.org/faq/ == MCA parameter files == If you want to permanently use some MCA parameter settings, you can create a file $HOME/.openmpi/mca-params.conf, e.g.: cat $HOME/.openmpi/mca-params.conf btl_tcp_if_exclude = lo,eth1 == Running applications on Multiprocessors/Multicores == Process can be bound to specific sockets and cores on nodes by choosing right options of mpirun. 
* [http://www.open-mpi.org/doc/v1.4/man1/mpirun.1.php#sect9 Process binding] * [http://www.open-mpi.org/doc/v1.4/man1/mpirun.1.php#sect10 Rankfiles] == Debugging applications on Multiprocessors/Multicores == * [http://www.open-mpi.org/faq/?category=debugging#serial-debuggers Serial debugger (e.g. gdb)] ** 1. Attach to individual MPI processes after they are running.<br /> For example, launch your MPI application as normal with mpirun. Then log in to the node(s) where your application is running and use the --pid option to gdb to attach to your application. ** 2. Use mpirun to launch xterms (or equivalent) with serial debuggers.<br /> shell$ mpirun -np 4 xterm -e gdb my_mpi_application 84c87b8c56fd26b07083e28cbc8472d74c5a2245 550 549 2011-01-12T10:14:15Z Zhongziming 5 /* Debugging applications on Multiprocessors/Multicores */ wikitext text/x-wiki http://www.open-mpi.org/faq/ == MCA parameter files == If you want to permanently use some MCA parameter settings, you can create a file $HOME/.openmpi/mca-params.conf, e.g.: cat $HOME/.openmpi/mca-params.conf btl_tcp_if_exclude = lo,eth1 == Running applications on Multiprocessors/Multicores == Processes can be bound to specific sockets and cores on nodes by choosing the right options of mpirun. * [http://www.open-mpi.org/doc/v1.4/man1/mpirun.1.php#sect9 Process binding] * [http://www.open-mpi.org/doc/v1.4/man1/mpirun.1.php#sect10 Rankfiles] == Debugging applications on Multiprocessors/Multicores == * [http://www.open-mpi.org/faq/?category=debugging#serial-debuggers Serial Debugger (GDB)] ** 1. Attach to individual MPI processes after they are running.<br /> For example, launch your MPI application as normal with mpirun. Then log in to the node(s) where your application is running and use the --pid option to gdb to attach to your application. ** 2. Use mpirun to launch xterms (or equivalent) with serial debuggers. 
shell$ mpirun -np 4 xterm -e gdb my_mpi_application * [http://www.open-mpi.org/faq/?category=debugging#parallel-debuggers Parallel Debuggers] ** [http://www.open-mpi.org/faq/?category=running#run-with-tv TotalView] ** [http://www.open-mpi.org/faq/?category=running#run-with-ddt DDT] 3be671e0586772137b6a8933f2d0f0a27924e7c8 HCL Cluster Network 0 68 557 468 2011-01-27T13:21:34Z Kiril 3 wikitext text/x-wiki <p align="center"><font size="5"><strong>Switch Management</strong></font></p> <p><font size="4">The cluster is connected via two Cisco Catalyst 3560G switches. The ingress bandwidth on any physical port can be configured to any value between 8kb/s and 1Gb/s. The switches are connected to each other via a gigabit SFP cable.</font></p> <p align="left"><font size="4">As the <a href="Specs.html">Cluster Specifications</a> show, each node has two Network Interfaces, each with its own IP Address. Each eth0 is connected to switch 1, and each eth1 is connected to switch 2. Which topology you wish to use will determine which IP address you should use when referring to each machine. </font></p> <p>&nbsp;</p> <p align="center"><font size="5"><strong>Switch Access</strong></font></p> <p><font size="4">The switches can be accessed through telnet from any machine on the cluster network. Type<br> telnet 192.168.21.252 for switch1 or telnet 192.168.21.253 for switch2, and the switch should prompt for a password. For the password, email Brett Becker.</font></p> <p>&nbsp;</p> <p align="center"><font size="5"><strong>Node Configuration</strong></font></p> <p><font size="4"> Here[http://www.linuxfoundation.org/collaborate/workgroups/networking/netem] is an excellent description of how to introduce packet delays for outgoing traffic, which is a way to control latency. 
</p> <p align="center"><font size="5"><strong>Switch Configuration</strong></font></p> <p><font size="4"><br> For example, we will demonstrate how to limit the bandwidth of the connection between hcl03 and hcl04 to 100Mbps:<br> -----------------------------------<br> hcl02 $&gt; telnet 192.168.21.252<br> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Trying 192.168.21.252...<br> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Connected to 192.168.21.252 (192.168.21.252).<br> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Escape character is '^]'.</font></p> <p><font size="4">&nbsp;&nbsp;&nbsp;&nbsp;User Access Verification</font></p> <p><font size="4">Password: &lt;enter password&gt;<br> hclswitch1&gt;enable<br> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Password: &lt;enter password&gt;<br> hclswitch1#configure terminal<br> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Enter configuration commands, one per line. End with CNTL/Z.<br> hclswitch1(config)#policy-map example<br> hclswitch1(config-pmap)#class ipclass1<br> hclswitch1(config-pmap-c)#police 100000000 800000 exceed-action drop<br> hclswitch1(config-pmap-c)#exit<br> hclswitch1(config-pmap)#exit<br> hclswitch1(config)#interface gigabitEthernet0/3<br> hclswitch1(config-if)#service-policy input example<br> hclswitch1(config-if)#exit<br> hclswitch1(config)#interface gigabitEthernet0/4<br> hclswitch1(config-if)#service-policy input example<br> hclswitch1(config-if)#exit<br> hclswitch1#show policy-map<br> &nbsp;&nbsp;&nbsp;Policy Map example<br> &nbsp;&nbsp;&nbsp;&nbsp;Class ipclass1<br> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;police 100000000 800000 exceed-action drop<br> hclswitch1#exit<br> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Connection closed by foreign host.</font></p> <p><font size="4">hcl02 $&gt;<br> ----------------------------------------</font></p> <p><font size="4">In the above example, we telnet to the switch and enter the password. We then type &#8220;enable&#8221; and enter the password again. Then we type &#8220;configure terminal&#8221; to bring us into config mode. 
We then create a new policy-map named &#8220;example&#8221;. We bind this policy-map to class &#8220;ipclass1&#8221;. ipclass1 is a class that incorporates all ports on the switch. PLEASE DO NOT REMOVE THIS CLASS! We then enter the key statement police 100000000 800000 exceed-action drop. This tells the switch that this policy-map is to limit the bandwidth to 100Mbps, with a bucket burst of 800000 bytes, and if the rate is exceeded packets are to be dropped. We then exit policy-map configuration and enter interface gigabitEthernet0/3. This is the port that is connected to hcl03. We then attach the policy-map &#8220;example&#8221; to this interface. We then do the same for port gigabitEthernet0/4 which is connected to hcl04. The policy-map is then viewed using the show command to ensure that it is correct. </font></p> <p><font size="4">This example demonstrates an important fact about the cluster. hcl01&#8217;s eth0 (192.168.21.3) is connected to port gigabitEthernet0/1 on switch1. Similarly, hclx&#8217;s eth0 (192.168.21.x+2) is connected to gigabitEthernet0/x on switch1.<br> Further, hcl01&#8217;s eth1 (192.168.21.103) is connected to gigabitEthernet0/1 on switch2. Similarly, hclx&#8217;s eth1 (192.168.21.100+x+2) is connected to gigabitEthernet0/x on switch2. <br> The table below shows this comprehensively. 
</font></p> <div align="center"> <table width="53%" border="1" cellpadding="0"> <tr> <td width="20%"><strong>Machine</strong></td> <td width="23%"><strong>IP Address</strong></td> <td width="26%"><strong>Switch1 Port</strong></td> <td width="31%"><strong>Switch2 Port</strong></td> </tr> <tr> <td>hcl01</td> <td>192.168.21.3</td> <td>gigabitEthernet0/1</td> <td>N/A</td> </tr> <tr> <td>hcl01_eth1</td> <td>192.168.21.103</td> <td>N/A</td> <td>gigabitEthernet0/1</td> </tr> <tr> <td>hcl02</td> <td>192.168.21.4</td> <td>gigabitEthernet0/2</td> <td>N/A</td> </tr> <tr> <td>hcl02_eth1</td> <td>192.168.21.104</td> <td>N/A</td> <td>gigabitEthernet0/2</td> </tr> <tr> <td>hcl03</td> <td>192.168.21.5</td> <td>gigabitEthernet0/3</td> <td>N/A</td> </tr> <tr> <td>hcl03_eth1</td> <td><div align="left">192.168.21.105</div></td> <td>N/A</td> <td>gigabitEthernet0/3</td> </tr> <tr> <td>hcl04</td> <td>192.168.21.6</td> <td>gigabitEthernet0/4</td> <td>N/A</td> </tr> <tr> <td>hcl04_eth1</td> <td>192.168.21.106</td> <td>N/A</td> <td>gigabitEthernet0/4</td> </tr> <tr> <td>hcl05</td> <td>192.168.21.7</td> <td>gigabitEthernet0/5</td> <td>N/A</td> </tr> <tr> <td>hcl05_eth1</td> <td>192.168.21.107</td> <td>N/A</td> <td>gigabitEthernet0/5</td> </tr> <tr> <td>hcl06</td> <td>192.168.21.8</td> <td>gigabitEthernet0/6</td> <td>N/A</td> </tr> <tr> <td>hcl06_eth1</td> <td>192.168.21.108</td> <td>N/A</td> <td>gigabitEthernet0/6</td> </tr> <tr> <td>hcl07</td> <td>192.168.21.9</td> <td>gigabitEthernet0/7</td> <td>N/A</td> </tr> <tr> <td>hcl07_eth1</td> <td>192.168.21.109</td> <td>N/A</td> <td>gigabitEthernet0/7</td> </tr> <tr> <td>hcl08</td> <td>192.168.21.10</td> <td>gigabitEthernet0/8</td> <td>N/A</td> </tr> <tr> <td>hcl08_eth1</td> <td>192.168.21.110</td> <td>N/A</td> <td>gigabitEthernet0/8</td> </tr> <tr> <td>hcl09</td> <td>192.168.21.11</td> <td>gigabitEthernet0/9</td> <td>N/A</td> </tr> <tr> <td>hcl09_eth1</td> <td>192.168.21.111</td> <td>N/A</td> <td>gigabitEthernet0/9</td> </tr> <tr> <td>hcl10</td> 
<td>192.168.21.12</td> <td>gigabitEthernet0/10</td> <td>N/A</td> </tr> <tr> <td>hcl10_eth1</td> <td>192.168.21.112</td> <td>N/A</td> <td>gigabitEthernet0/10</td> </tr> <tr> <td>hcl11</td> <td>192.168.21.13</td> <td>gigabitEthernet0/11</td> <td>N/A</td> </tr> <tr> <td>hcl11_eth1</td> <td>192.168.21.113</td> <td>N/A</td> <td>gigabitEthernet0/11</td> </tr> <tr> <td>hcl12</td> <td>192.168.21.14</td> <td>gigabitEthernet0/12</td> <td>N/A</td> </tr> <tr> <td>hcl12_eth1</td> <td>192.168.21.114</td> <td>N/A</td> <td>gigabitEthernet0/12</td> </tr> <tr> <td>hcl13</td> <td>192.168.21.15</td> <td>gigabitEthernet0/13</td> <td>N/A</td> </tr> <tr> <td>hcl13_eth1</td> <td>192.168.21.115</td> <td>N/A</td> <td>gigabitEthernet0/13</td> </tr> <tr> <td>hcl14</td> <td>192.168.21.16</td> <td>gigabitEthernet0/14</td> <td>N/A</td> </tr> <tr> <td>hcl14_eth1</td> <td>192.168.21.116</td> <td>N/A</td> <td>gigabitEthernet0/14</td> </tr> <tr> <td>hcl15</td> <td>192.168.21.17</td> <td>gigabitEthernet0/15</td> <td>N/A</td> </tr> <tr> <td>hcl15_eth1</td> <td>192.168.21.117</td> <td>N/A</td> <td>gigabitEthernet0/15</td> </tr> <tr> <td>hcl16</td> <td>192.168.21.18</td> <td>gigabitEthernet0/16</td> <td>N/A</td> </tr> <tr> <td>hcl16_eth1</td> <td>192.168.21.118</td> <td>N/A</td> <td>gigabitEthernet0/16</td> </tr> </table> </div> <p></p> <p> </p> <p><font size="4">Limiting the bandwidth between the two switches is similar to the above example, except that on switch1 the interface gigabitEthernet0/25 should be assigned the desired policy-map. The same should then be done on switch2. </font></p> <p>&nbsp;</p> <p align="center"><font size="5"><strong>Simulating Two Clusters</strong></font></p> <p><font size="4">If you want to simulate two clusters within the cluster, consider the following example.</font></p> <p><font size="4">Say we want Cluster A to be comprised of hcl01 and hcl02, and Cluster B to be comprised of hcl03 and hcl04.
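The addressing scheme in the table above follows a fixed pattern: hclXX's eth0 is 192.168.21.(XX+2) on switch1 port gigabitEthernet0/XX, and its eth1 is 192.168.21.(102+XX) on the same port number of switch2. As a convenience sketch (not a tool that ships with the cluster), this pattern can be computed for any node number:

```shell
# Sketch: derive the addresses and switch port for a given HCL node number,
# following the pattern of the table above (hclXX eth0 -> 192.168.21.(XX+2),
# hclXX eth1 -> 192.168.21.(102+XX), both on port gigabitEthernet0/XX).
hcl_addresses() {
    local n=$1                                  # node number, 1..16
    printf 'eth0=192.168.21.%d\n' $((n + 2))    # switch1 address
    printf 'eth1=192.168.21.%d\n' $((n + 102))  # switch2 address
    printf 'port=gigabitEthernet0/%d\n' "$n"    # port number on either switch
}

hcl_addresses 3
```

Running it for node 3 reproduces the hcl03 row of the table.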
We want hcl01 and hcl02 to &#8220;talk&#8221; to each other at 10Mbps, and hcl03 and hcl04 to &#8220;talk&#8221; at 1Gbps. Additionally, we want to restrict the link between Clusters A and B to a bandwidth of 250Mbps. <br> To accomplish this, we perform the following steps:</font></p> <p><font size="4">1.) Log onto hcl01 and make sure that eth0 is active and eth1 is not. This means that the machine is connected to switch1.</font></p> <p><font size="4">2.) Do the same for hcl02. </font></p> <p><font size="4">3.) Log onto switch1 and create a policy-map to limit the bandwidth to 10Mbps (10000000bps). Attach this policy-map to gigabitEthernet0/1 and gigabitEthernet0/2. </font></p> <p><font size="4">4.) Log onto hcl03 and perform the following steps:</font></p> <p><font size="4"> hcl03 $&gt; /sbin/ifconfig <br> This will list the active network devices. If the user before you cleaned up after themselves, only &#8220;lo&#8221; (the loopback interface) and &#8220;eth0&#8221; should be listed. <br> hcl03 $&gt; /sbin/ifup eth1 (on Debian: sudo /sbin/ifup eth1)<br> hcl03 $&gt; /sbin/ifdown eth0 (on Debian: sudo /sbin/ifdown eth0)</font></p> <p><font size="4">This connects hcl03 to switch2, as eth1 is connected to switch2. hcl03 is then disconnected from switch1, as eth0 is connected to switch1. MAKE SURE YOU ALWAYS BRING ONE INTERFACE UP BEFORE YOU BRING ANOTHER DOWN. OTHERWISE THE MACHINE WILL BE ISOLATED WITH NO ACTIVE NETWORK DEVICES. To see which devices are currently active, use the /sbin/ifconfig command. </font></p> <p><font size="4">5.) The same should be done for hcl04. </font></p> <p><font size="4">6.) Since we want these machines to talk at 1Gbps, we should log onto switch2 and make sure that no policy-maps exist there. </font></p> <p><font size="4">7.) Log onto switch1 and create a new policy-map, limit it to 250000000bps, and attach it to gigabitEthernet0/25.
Do the same for switch2.</font></p> <p><font size="4">Done!</font></p> <p> </p> <p><font size="4">The following example shows how to delete a policy-map. YOU SHOULD ALWAYS REMEMBER TO DELETE YOUR POLICY-MAPS WHEN YOUR JOBS ARE DONE SO OTHER USERS&#8217; JOBS DON&#8217;T GET MESSED UP! </font></p> <p><font size="4">--------------------------------------------------------------<br> hcl0x $&gt; telnet 192.168.21.252<br> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Trying 192.168.21.252...<br> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Connected to 192.168.21.252 (192.168.21.252).<br> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Escape character is '^]'.</font></p> <p><font size="4">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;User Access Verification<br> Password: &lt;enter password&gt;<br> hclswitch1&gt;enable<br> Password:<br> hclswitch1#show policy-map<br> &nbsp;&nbsp;&nbsp;Policy Map example<br> &nbsp;&nbsp;&nbsp;&nbsp;Class ipclass1<br> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;police 100000000 800000 exceed-action drop<br> hclswitch1#config t<br> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Enter configuration commands, one per line. End with CNTL/Z.<br> hclswitch1(config)#no policy-map example<br> hclswitch1(config)#exit<br> hclswitch1#show policy-map<br> <br> hclswitch1#exit<br> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Connection closed by foreign host.</font></p> <p><font size="4">hcl0x $&gt;<br> ---------------------------------------------------------------</font></p> 0074f0b7b5eac8eee06495cbc514cbc47df1e5ac 558 557 2011-01-27T13:23:20Z Kiril 3 wikitext text/x-wiki <p align="center"><font size="5"><strong>Switch Management</strong></font></p> <p><font size="4">The cluster is connected via two Cisco Catalyst 3560G switches. The ingress bandwidth on any physical port can be configured to any value between 8 kb/s and 1 Gb/s.
The switches are connected to each other via a Gigabit SFP cable.</font></p> <p align="left"><font size="4">As the <a href="Specs.html">Cluster Specifications</a> show, each node has two network interfaces, each with its own IP address. Each eth0 is connected to switch 1, and each eth1 is connected to switch 2. Which topology you wish to use will determine which IP address you should use when referring to each machine. </font></p> <p>&nbsp;</p> <p align="center"><font size="5"><strong>Switch Access</strong></font></p> <p><font size="4">The switches can be accessed through telnet from any machine on the cluster network. Type<br> telnet 192.168.21.252 for switch1 or telnet 192.168.21.253 for switch2, and the switch should prompt for a password. For the password, email Brett Becker.</font></p> <p>&nbsp;</p> <p align="center"><font size="5"><strong>Node Configuration</strong></font></p> <p><font size="4"> An excellent description of how to introduce packet delays for outgoing traffic, which is one way to control latency, can be found here: http://www.linuxfoundation.org/collaborate/workgroups/networking/netem </font></p> <p>&nbsp;</p> <p align="center"><font size="5"><strong>Switch Configuration</strong></font></p> <p><font size="4"><br> As an example, we will demonstrate how to limit the bandwidth of the connection between hcl03 and hcl04 to 100Mbps:<br> -----------------------------------<br> hcl02 $&gt; telnet 192.168.21.252<br> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Trying 192.168.21.252...<br> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Connected to 192.168.21.252 (192.168.21.252).<br> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Escape character is '^]'.</font></p> <p><font size="4">&nbsp;&nbsp;&nbsp;&nbsp;User Access Verification</font></p> <p><font size="4">Password: &lt;enter password&gt;<br> hclswitch1&gt;enable<br> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Password: &lt;enter password&gt;<br> hclswitch1#configure terminal<br> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Enter configuration commands, one per line.
End with CNTL/Z.<br> hclswitch1(config)#policy-map example<br> hclswitch1(config-pmap)#class ipclass1<br> hclswitch1(config-pmap-c)#police 100000000 800000 exceed-action drop<br> hclswitch1(config-pmap-c)#exit<br> hclswitch1(config-pmap)#exit<br> hclswitch1(config)#interface gigabitEthernet0/3<br> hclswitch1(config-if)#service-policy input example<br> hclswitch1(config-if)#exit<br> hclswitch1(config)#interface gigabitEthernet0/4<br> hclswitch1(config-if)#service-policy input example<br> hclswitch1(config-if)#exit<br> hclswitch1(config)#exit<br> hclswitch1#show policy-map<br> &nbsp;&nbsp;&nbsp;Policy Map example<br> &nbsp;&nbsp;&nbsp;&nbsp;Class ipclass1<br> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;police 100000000 800000 exceed-action drop<br> hclswitch1#exit<br> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Connection closed by foreign host.</font></p> <p><font size="4">hcl02 $&gt;<br> ----------------------------------------</font></p> <p><font size="4">In the above example, we telnet to the switch and enter the password. We then type &#8220;enable&#8221; and enter the password again. Then we type &#8220;configure terminal&#8221; to bring us into config mode. We then create a new policy-map named &#8220;example&#8221;. We bind this policy-map to class &#8220;ipclass1&#8221;. ipclass1 is a class that incorporates all ports on the switch. PLEASE DO NOT REMOVE THIS CLASS! We then enter the key statement police 100000000 800000 exceed-action drop. This tells the switch that this policy-map is to limit the bandwidth to 100Mbps, with a burst size of 800000 bytes, and that packets are to be dropped if the rate is exceeded. We then exit policy-map configuration and enter interface gigabitEthernet0/3. This is the port that is connected to hcl03. We attach the policy-map &#8220;example&#8221; to this interface, and then do the same for port gigabitEthernet0/4, which is connected to hcl04. The policy-map is then viewed using the show command to ensure that it is correct.
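The two numbers given to police are plain integers: the first is the rate in bits per second, the second the burst size in bytes. As a sketch (a local convenience helper, not an official Cisco tool), the line can be generated from a desired rate in Mbit/s, scaling the burst with the rate as in the 100Mbps / 800000-byte example above:

```shell
# Sketch: build the "police" statement for a desired rate in Mbit/s.
# The switch takes the rate in bits per second and the burst in bytes;
# here the burst is scaled with the rate so that 100 Mbps gives the
# 800000-byte burst used in the example above. Assumption: the linear
# burst scaling is our convention, not a Cisco requirement.
police_line() {
    local mbps=$1
    local bps=$((mbps * 1000000))   # rate in bits per second
    local burst=$((bps / 125))      # burst in bytes (800000 for 100 Mbps)
    echo "police $bps $burst exceed-action drop"
}

police_line 100
```

For the 10Mbps clusters in the next section, `police_line 10` gives the corresponding statement.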
</font></p> <p><font size="4">This example demonstrates an important fact about the cluster. hcl01&#8217;s eth0 (192.168.21.3) is connected to port gigabitEthernet0/1 on switch1. Similarly, hclx&#8217;s eth0 (192.168.21.x+2) is connected to gigabitEthernet0/x on switch1.<br> Further, hcl01&#8217;s eth1 (192.168.21.103) is connected to gigabitEthernet0/1 on switch2. Similarly, hclx&#8217;s eth1 (192.168.21.100+x+2) is connected to gigabitEthernet0/x on switch2. <br> The table below shows this comprehensively. </font></p> <div align="center"> <table width="53%" border="1" cellpadding="0"> <tr> <td width="20%"><strong>Machine</strong></td> <td width="23%"><strong>IP Address</strong></td> <td width="26%"><strong>Switch1 Port</strong></td> <td width="31%"><strong>Switch2 Port</strong></td> </tr> <tr> <td>hcl01</td> <td>192.168.21.3</td> <td>gigabitEthernet0/1</td> <td>N/A</td> </tr> <tr> <td>hcl01_eth1</td> <td>192.168.21.103</td> <td>N/A</td> <td>gigabitEthernet0/1</td> </tr> <tr> <td>hcl02</td> <td>192.168.21.4</td> <td>gigabitEthernet0/2</td> <td>N/A</td> </tr> <tr> <td>hcl02_eth1</td> <td>192.168.21.104</td> <td>N/A</td> <td>gigabitEthernet0/2</td> </tr> <tr> <td>hcl03</td> <td>192.168.21.5</td> <td>gigabitEthernet0/3</td> <td>N/A</td> </tr> <tr> <td>hcl03_eth1</td> <td><div align="left">192.168.21.105</div></td> <td>N/A</td> <td>gigabitEthernet0/3</td> </tr> <tr> <td>hcl04</td> <td>192.168.21.6</td> <td>gigabitEthernet0/4</td> <td>N/A</td> </tr> <tr> <td>hcl04_eth1</td> <td>192.168.21.106</td> <td>N/A</td> <td>gigabitEthernet0/4</td> </tr> <tr> <td>hcl05</td> <td>192.168.21.7</td> <td>gigabitEthernet0/5</td> <td>N/A</td> </tr> <tr> <td>hcl05_eth1</td> <td>192.168.21.107</td> <td>N/A</td> <td>gigabitEthernet0/5</td> </tr> <tr> <td>hcl06</td> <td>192.168.21.8</td> <td>gigabitEthernet0/6</td> <td>N/A</td> </tr> <tr> <td>hcl06_eth1</td> <td>192.168.21.108</td> <td>N/A</td> <td>gigabitEthernet0/6</td> </tr> <tr> <td>hcl07</td> <td>192.168.21.9</td>
<td>gigabitEthernet0/7</td> <td>N/A</td> </tr> <tr> <td>hcl07_eth1</td> <td>192.168.21.109</td> <td>N/A</td> <td>gigabitEthernet0/7</td> </tr> <tr> <td>hcl08</td> <td>192.168.21.10</td> <td>gigabitEthernet0/8</td> <td>N/A</td> </tr> <tr> <td>hcl08_eth1</td> <td>192.168.21.110</td> <td>N/A</td> <td>gigabitEthernet0/8</td> </tr> <tr> <td>hcl09</td> <td>192.168.21.11</td> <td>gigabitEthernet0/9</td> <td>N/A</td> </tr> <tr> <td>hcl09_eth1</td> <td>192.168.21.111</td> <td>N/A</td> <td>gigabitEthernet0/9</td> </tr> <tr> <td>hcl10</td> <td>192.168.21.12</td> <td>gigabitEthernet0/10</td> <td>N/A</td> </tr> <tr> <td>hcl10_eth1</td> <td>192.168.21.112</td> <td>N/A</td> <td>gigabitEthernet0/10</td> </tr> <tr> <td>hcl11</td> <td>192.168.21.13</td> <td>gigabitEthernet0/11</td> <td>N/A</td> </tr> <tr> <td>hcl11_eth1</td> <td>192.168.21.113</td> <td>N/A</td> <td>gigabitEthernet0/11</td> </tr> <tr> <td>hcl12</td> <td>192.168.21.14</td> <td>gigabitEthernet0/12</td> <td>N/A</td> </tr> <tr> <td>hcl12_eth1</td> <td>192.168.21.114</td> <td>N/A</td> <td>gigabitEthernet0/12</td> </tr> <tr> <td>hcl13</td> <td>192.168.21.15</td> <td>gigabitEthernet0/13</td> <td>N/A</td> </tr> <tr> <td>hcl13_eth1</td> <td>192.168.21.115</td> <td>N/A</td> <td>gigabitEthernet0/13</td> </tr> <tr> <td>hcl14</td> <td>192.168.21.16</td> <td>gigabitEthernet0/14</td> <td>N/A</td> </tr> <tr> <td>hcl14_eth1</td> <td>192.168.21.116</td> <td>N/A</td> <td>gigabitEthernet0/14</td> </tr> <tr> <td>hcl15</td> <td>192.168.21.17</td> <td>gigabitEthernet0/15</td> <td>N/A</td> </tr> <tr> <td>hcl15_eth1</td> <td>192.168.21.117</td> <td>N/A</td> <td>gigabitEthernet0/15</td> </tr> <tr> <td>hcl16</td> <td>192.168.21.18</td> <td>gigabitEthernet0/16</td> <td>N/A</td> </tr> <tr> <td>hcl16_eth1</td> <td>192.168.21.118</td> <td>N/A</td> <td>gigabitEthernet0/16</td> </tr> </table> </div> <p></p> <p> </p> <p><font size="4">Limiting the bandwidth between the two switches is similar to the above example, but on switch1, Interface 
gigabitEthernet0/25 should be assigned the desired policy-map. The same should then be done on switch2. </font></p> <p>&nbsp;</p> <p align="center"><font size="5"><strong>Simulating Two Clusters</strong></font></p> <p><font size="4">If you want to simulate two clusters within the cluster, consider the following example.</font></p> <p><font size="4">Say we want Cluster A to be comprised of hcl01 and hcl02, and Cluster B to be comprised of hcl03 and hcl04. We want hcl01 and hcl02 to &#8220;talk&#8221; to each other at 10Mbps, and hcl03 and hcl04 to &#8220;talk&#8221; at 1Gbps. Additionally, we want to restrict the link between Clusters A and B to a bandwidth of 250Mbps. <br> To accomplish this, we perform the following steps:</font></p> <p><font size="4">1.) Log onto hcl01 and make sure that eth0 is active and eth1 is not. This means that the machine is connected to switch1.</font></p> <p><font size="4">2.) Do the same for hcl02. </font></p> <p><font size="4">3.) Log onto switch1 and create a policy-map to limit the bandwidth to 10Mbps (10000000bps). Attach this policy-map to gigabitEthernet0/1 and gigabitEthernet0/2. </font></p> <p><font size="4">4.) Log onto hcl03 and perform the following steps:</font></p> <p><font size="4"> hcl03 $&gt; /sbin/ifconfig <br> This will list the active network devices. If the user before you cleaned up after themselves, only &#8220;lo&#8221; (the loopback interface) and &#8220;eth0&#8221; should be listed. <br> hcl03 $&gt; /sbin/ifup eth1 (on Debian: sudo /sbin/ifup eth1)<br> hcl03 $&gt; /sbin/ifdown eth0 (on Debian: sudo /sbin/ifdown eth0)</font></p> <p><font size="4">This connects hcl03 to switch2, as eth1 is connected to switch2. hcl03 is then disconnected from switch1, as eth0 is connected to switch1. MAKE SURE YOU ALWAYS BRING ONE INTERFACE UP BEFORE YOU BRING ANOTHER DOWN. OTHERWISE THE MACHINE WILL BE ISOLATED WITH NO ACTIVE NETWORK DEVICES.
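The up-before-down rule can be captured in a small wrapper. This is a sketch only (the switch_to helper is hypothetical, and shown in dry-run form, echoing the commands instead of executing them, so that the ordering is explicit):

```shell
# Sketch: move a node between switches by swapping interfaces,
# always bringing the new interface up BEFORE taking the old one down,
# so the machine is never left with no active network device.
# Dry run: the commands are echoed; drop the echoes (and prefix sudo
# on Debian) to actually perform the switch.
switch_to() {
    local up=$1 down=$2             # e.g. switch_to eth1 eth0
    echo "/sbin/ifup $up"           # 1) activate the new interface first
    echo "/sbin/ifdown $down"       # 2) only then drop the old one
}

switch_to eth1 eth0
```

Moving a node back to switch1 is simply `switch_to eth0 eth1`.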
To see which devices are currently active, use the /sbin/ifconfig command. </font></p> <p><font size="4">5.) The same should be done for hcl04. </font></p> <p><font size="4">6.) Since we want these machines to talk at 1Gbps, we should log onto switch2 and make sure that no policy-maps exist there. </font></p> <p><font size="4">7.) Log onto switch1 and create a new policy-map, limit it to 250000000bps, and attach it to gigabitEthernet0/25. Do the same for switch2.</font></p> <p><font size="4">Done!</font></p> <p> </p> <p><font size="4">The following example shows how to delete a policy-map. YOU SHOULD ALWAYS REMEMBER TO DELETE YOUR POLICY-MAPS WHEN YOUR JOBS ARE DONE SO OTHER USERS&#8217; JOBS DON&#8217;T GET MESSED UP! </font></p> <p><font size="4">--------------------------------------------------------------<br> hcl0x $&gt; telnet 192.168.21.252<br> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Trying 192.168.21.252...<br> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Connected to 192.168.21.252 (192.168.21.252).<br> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Escape character is '^]'.</font></p> <p><font size="4">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;User Access Verification<br> Password: &lt;enter password&gt;<br> hclswitch1&gt;enable<br> Password:<br> hclswitch1#show policy-map<br> &nbsp;&nbsp;&nbsp;Policy Map example<br> &nbsp;&nbsp;&nbsp;&nbsp;Class ipclass1<br> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;police 100000000 800000 exceed-action drop<br> hclswitch1#config t<br> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Enter configuration commands, one per line.
End with CNTL/Z.<br> hclswitch1(config)#no policy-map example<br> hclswitch1(config)#exit<br> hclswitch1#show policy-map<br> <br> hclswitch1#exit<br> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Connection closed by foreign host.</font></p> <p><font size="4">hcl0x $&gt;<br> ---------------------------------------------------------------</font></p> 5b3a121719ba04a4fbf7b9a81b0045ae03452851 Dia 0 8 562 17 2011-03-11T15:59:18Z Root 1 wikitext text/x-wiki an alternative to Visio for making diagrams http://live.gnome.org/Dia/ http://dia-installer.de/index_en.html (for Windows) 6ad1ad8ad46636d4d30aeaded4eacc6957954094 LaTeX 0 20 564 405 2011-03-11T16:04:52Z Root 1 wikitext text/x-wiki * [http://en.wikipedia.org/wiki/BibTeX BibTeX] - reference management software * [http://en.wikipedia.org/wiki/Beamer_(LaTeX) Beamer] - a package for presentation slides * Listings - a package for putting programming code within LaTeX - http://tug.ctan.org/cgi-bin/ctanPackageInformation.py?id=listings * LaTeX can be used in the [[Doxygen]] documentation in order to include formulas, bibliographic references, etc. == Editors == * Kile * Emacs + plugin * [[Eclipse]] + [[TeXlipse]] == Windows == * [http://miktex.org/ MiKTeX] - LaTeX implementation * [http://www.texniccenter.org/ TeXnicCenter] - editor 0b5cda1e0eac05ca1a7e441e1752db6b1ab557c5 569 564 2011-03-31T14:39:40Z Davepc 2 /* Editors */ wikitext text/x-wiki * [http://en.wikipedia.org/wiki/BibTeX BibTeX] - reference management software * [http://en.wikipedia.org/wiki/Beamer_(LaTeX) Beamer] - a package for presentation slides * Listings - a package for putting programming code within LaTeX - http://tug.ctan.org/cgi-bin/ctanPackageInformation.py?id=listings * LaTeX can be used in the [[Doxygen]] documentation in order to include formulas, bibliographic references, etc. == Editors == *vim with [http://vim-latex.sourceforge.net/index.php?subject=manual&title=Tutorial#tutorial vim-latex] *Kile *Emacs + plugin *[[Eclipse]] + [[TeXlipse]] == Windows == * [http://miktex.org/
MiKTeX] - LaTeX implementation * [http://www.texniccenter.org/ TeXnicCenter] - editor 3068dc0e2cead80e18d7e85a6eb53ec2014ea3a5 570 569 2011-03-31T14:44:29Z Davepc 2 wikitext text/x-wiki *[http://en.wikipedia.org/wiki/BibTeX BibTeX] - reference management software *[http://en.wikipedia.org/wiki/Beamer_(LaTeX) Beamer] - a package for presentation slides *Listings - a package for putting programming code within LaTeX - http://tug.ctan.org/cgi-bin/ctanPackageInformation.py?id=listings *LaTeX can be used in the [[Doxygen]] documentation in order to include formulas, bibliographic references, etc. == Editors == *vim with [http://vim-latex.sourceforge.net/index.php?subject=manual&title=Tutorial#tutorial vim-latex] *Kile *Emacs + plugin *[[Eclipse]] + [[TeXlipse]] == Windows == *[http://miktex.org/ MiKTeX] - LaTeX implementation *[http://www.texniccenter.org/ TeXnicCenter] - editor == Working with eps images == .tex with .eps images <source lang="">latex file.tex dvipdf file.dvi</source> or convert all eps images to pdf <source lang="">for i in `ls *.eps`; do echo $i; epspdf $i; done</source> 34a3244b3bcd4c1c77779e05f233d614fcca902c 571 570 2011-03-31T14:45:21Z Davepc 2 wikitext text/x-wiki *[http://en.wikipedia.org/wiki/BibTeX BibTeX] - reference management software *[http://en.wikipedia.org/wiki/Beamer_(LaTeX) Beamer] - a package for presentation slides *Listings - a package for putting programming code within LaTeX - http://tug.ctan.org/cgi-bin/ctanPackageInformation.py?id=listings *LaTeX can be used in the [[Doxygen]] documentation in order to include formulas, bibliographic references, etc. == Editors == *vim with [http://vim-latex.sourceforge.net/index.php?subject=manual&title=Tutorial#tutorial vim-latex] *Kile *Emacs + plugin *[[Eclipse]] + [[TeXlipse]] == Windows == *[http://miktex.org/ MiKTeX] - LaTeX implementation *[http://www.texniccenter.org/ TeXnicCenter] - editor == Working with eps images == .tex with .eps images <source lang="">latex file.tex dvipdf file.dvi</source> or
convert all eps images to pdf <source lang="">for i in `ls *.eps`; do echo $i; epspdf $i; done</source> then <source lang="">pdflatex file.tex</source> be05c9f2700f1e9debec4ad26dbea47172cc78ba 572 571 2011-03-31T14:47:44Z Davepc 2 wikitext text/x-wiki *[http://en.wikipedia.org/wiki/BibTeX BibTeX] - reference management software *[http://en.wikipedia.org/wiki/Beamer_(LaTeX) Beamer] - a package for presentation slides *Listings - a package for putting programming code within LaTeX - http://tug.ctan.org/cgi-bin/ctanPackageInformation.py?id=listings *LaTeX can be used in the [[Doxygen]] documentation in order to include formulas, bibliographic references, etc. == Editors == *vim with [http://vim-latex.sourceforge.net/index.php?subject=manual&title=Tutorial#tutorial vim-latex] *Kile *Emacs + plugin *[[Eclipse]] + [[TeXlipse]] == Windows == *[http://miktex.org/ MiKTeX] - LaTeX implementation *[http://www.texniccenter.org/ TeXnicCenter] - editor == Working with eps images == .tex with .eps images <source lang="">latex file.tex dvipdf file.dvi</source> or convert all eps images to pdf <source lang="">for i in `ls *.eps`; do echo $i; epspdf $i; done</source> then <source lang="">pdflatex file.tex</source> If you include graphics without the extension, both latex and pdflatex will work when you keep both .eps and .pdf files in your image directory. b799c8a03dc94086050ce09997b05477cbf0eaee JabRef 0 74 565 2011-03-11T16:07:20Z Root 1 Created page with "reference management software http://jabref.sourceforge.net/ http://en.wikipedia.org/wiki/JabRef Create a database of references for your PhD thesis: BibTeX + sources." wikitext text/x-wiki reference management software http://jabref.sourceforge.net/ http://en.wikipedia.org/wiki/JabRef Create a database of references for your PhD thesis: BibTeX + sources.
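The eps-to-pdf loop in the LaTeX notes above parses the output of `ls`, which breaks on unusual filenames. As a sketch, the same conversion can glob directly; here the epspdf call is echoed so the sketch runs even without epspdf installed:

```shell
# Sketch: convert every .eps file in the current directory to .pdf,
# globbing directly instead of parsing `ls` (safe with spaces in names).
# The epspdf invocation is echoed for a dry run; remove the echo to
# actually convert (assumes epspdf is installed, as in the notes above).
convert_eps() {
    for f in *.eps; do
        [ -e "$f" ] || continue             # skip if no .eps files at all
        echo epspdf "$f" "${f%.eps}.pdf"    # derive the .pdf output name
    done
}
```

After conversion, `pdflatex file.tex` picks up the .pdf versions as described above.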
a152cb0cf976a6468bcb415b96a7aec7007db901 Grid5000 0 6 573 428 2011-04-01T10:35:53Z Davepc 2 wikitext text/x-wiki https://www.grid5000.fr/mediawiki/index.php/Grid5000:Home == Login, job submission, deployment of image == *Select sites and clusters for experiments, using information on the [https://www.grid5000.fr/mediawiki/index.php/Grid5000:Network#Grid.275000_Sites Grid5000 network] and the [https://www.grid5000.fr/mediawiki/index.php/Status Status page] *Access is provided via access nodes '''access.SITE.grid5000.fr''' marked [https://www.grid5000.fr/mediawiki/index.php/External_access here] as ''accessible from '''everywhere''' via ssh with the '''keyboard-interactive''' authentication method''. As soon as you are on one of the sites, you can ssh directly to the frontend node of any other site: <source lang="bash"> access_$ ssh frontend.SITE2 </source> *There is no access to the Internet from computing nodes (external IPs should be registered on the proxy), therefore, download/update your stuff at the access nodes. Several revision control clients are available. *Each site has a separate NFS, therefore, to run an application on several sites at once, you need to copy it ('''scp, sftp, rsync''') between access or frontend nodes. *Jobs are run from the frontend nodes, using a [http://en.wikipedia.org/wiki/OpenPBS PBS]-like system, [https://www.grid5000.fr/mediawiki/index.php/Cluster_experiment-OAR2 OAR].
Basic commands: **'''oarstat''' - queue status **'''oarsub''' - job submission **'''oardel''' - job removal Interactive job on deployed images: <source lang="bash"> frontend_$ oarsub -I -t deploy -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> Batch job on installed images: <source lang="bash"> frontend_$ oarsub BATCH_FILE -t allow_classic_ssh -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> *The image to deploy can be created and loaded with the help of a [http://wiki.systemimager.org/index.php/Main_Page Systemimager]-like system, [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2 Kadeploy]. Creating: [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2#Tune_an_environment_to_build_another_one:_customize_authentification_parameters described here] Loading: <source lang="bash"> frontend_$ kadeploy3 -a PATH_TO_PRIVATE_IMAGE_DESC -f $OAR_FILE_NODES </source> A Linux distribution lenny-x64-nfs-2.1 with mc, subversion, autotools, doxygen, MPICH2, GSL, Boost, R, gnuplot, graphviz, X11, evince is available at Orsay in /home/nancy/alastovetsky/grid5000. == Compiling and running MPI applications == *Compilation should be done on one of the reserved nodes (e.g. ssh `head -n 1 $OAR_NODEFILE`) *Running MPI applications is described [https://www.grid5000.fr/mediawiki/index.php/Run_MPI_On_Grid%275000 here] **mpirun/mpiexec should be run from one of the reserved nodes (e.g. ssh `head -n 1 $OAR_NODEFILE`) Setting up new deploy image ... launch image ...
<source> deb http://ftp.fr.debian.org/debian/ sid main contrib non-free </source> a5c49e8ec3a4446165cd09445785e3d9277be440 574 573 2011-04-01T11:16:33Z Davepc 2 wikitext text/x-wiki https://www.grid5000.fr/mediawiki/index.php/Grid5000:Home == Login, job submission, deployment of image == *Select sites and clusters for experiments, using information on the [https://www.grid5000.fr/mediawiki/index.php/Grid5000:Network#Grid.275000_Sites Grid5000 network] and the [https://www.grid5000.fr/mediawiki/index.php/Status Status page] *Access is provided via access nodes '''access.SITE.grid5000.fr''' marked [https://www.grid5000.fr/mediawiki/index.php/External_access here] as ''accessible from '''everywhere''' via ssh with the '''keyboard-interactive''' authentication method''. As soon as you are on one of the sites, you can ssh directly to the frontend node of any other site: <source lang="bash"> access_$ ssh frontend.SITE2 </source> *There is no access to the Internet from computing nodes (external IPs should be registered on the proxy), therefore, download/update your stuff at the access nodes. Several revision control clients are available. *Each site has a separate NFS, therefore, to run an application on several sites at once, you need to copy it ('''scp, sftp, rsync''') between access or frontend nodes. *Jobs are run from the frontend nodes, using a [http://en.wikipedia.org/wiki/OpenPBS PBS]-like system, [https://www.grid5000.fr/mediawiki/index.php/Cluster_experiment-OAR2 OAR].
Basic commands: **'''oarstat''' - queue status **'''oarsub''' - job submission **'''oardel''' - job removal Interactive job on deployed images: <source lang="bash"> frontend_$ oarsub -I -t deploy -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> Batch job on installed images: <source lang="bash"> frontend_$ oarsub BATCH_FILE -t allow_classic_ssh -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> *The image to deploy can be created and loaded with the help of a [http://wiki.systemimager.org/index.php/Main_Page Systemimager]-like system, [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2 Kadeploy]. Creating: [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2#Tune_an_environment_to_build_another_one:_customize_authentification_parameters described here] Loading: <source lang="bash"> frontend_$ kadeploy3 -a PATH_TO_PRIVATE_IMAGE_DESC -f $OAR_FILE_NODES </source> A Linux distribution lenny-x64-nfs-2.1 with mc, subversion, autotools, doxygen, MPICH2, GSL, Boost, R, gnuplot, graphviz, X11, evince is available at Orsay in /home/nancy/alastovetsky/grid5000. == Compiling and running MPI applications == *Compilation should be done on one of the reserved nodes (e.g. ssh `head -n 1 $OAR_NODEFILE`) *Running MPI applications is described [https://www.grid5000.fr/mediawiki/index.php/Run_MPI_On_Grid%275000 here] **mpirun/mpiexec should be run from one of the reserved nodes (e.g. ssh `head -n 1 $OAR_NODEFILE`) Setting up new deploy image ... launch image ...
Edit /etc/apt/sources.list <source lang="text"> deb http://ftp.fr.debian.org/debian/ sid main contrib non-free </source> <source lang="bash"> apt-get update apt-get dist-upgrade apt-get install autoconf automake ctags </source> 5d85c3ebe4fb832e674f6c9d045829a29563ab2b 575 574 2011-04-01T11:17:07Z Davepc 2 wikitext text/x-wiki https://www.grid5000.fr/mediawiki/index.php/Grid5000:Home == Login, job submission, deployment of image == *Select sites and clusters for experiments, using information on the [https://www.grid5000.fr/mediawiki/index.php/Grid5000:Network#Grid.275000_Sites Grid5000 network] and the [https://www.grid5000.fr/mediawiki/index.php/Status Status page] *Access is provided via access nodes '''access.SITE.grid5000.fr''' marked [https://www.grid5000.fr/mediawiki/index.php/External_access here] as ''accessible from '''everywhere''' via ssh with the '''keyboard-interactive''' authentication method''. As soon as you are on one of the sites, you can ssh directly to the frontend node of any other site: <source lang="bash"> access_$ ssh frontend.SITE2 </source> *There is no access to the Internet from computing nodes (external IPs should be registered on the proxy), therefore, download/update your stuff at the access nodes. Several revision control clients are available. *Each site has a separate NFS, therefore, to run an application on several sites at once, you need to copy it ('''scp, sftp, rsync''') between access or frontend nodes. *Jobs are run from the frontend nodes, using a [http://en.wikipedia.org/wiki/OpenPBS PBS]-like system, [https://www.grid5000.fr/mediawiki/index.php/Cluster_experiment-OAR2 OAR].
Basic commands: **'''oarstat''' - queue status **'''oarsub''' - job submission **'''oardel''' - job removal Interactive job on deployed images: <source lang="bash"> frontend_$ oarsub -I -t deploy -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> Batch job on installed images: <source lang="bash"> frontend_$ oarsub BATCH_FILE -t allow_classic_ssh -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> *The image to deploy can be created and loaded with the help of a [http://wiki.systemimager.org/index.php/Main_Page Systemimager]-like system, [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2 Kadeploy]. Creating: [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2#Tune_an_environment_to_build_another_one:_customize_authentification_parameters described here] Loading: <source lang="bash"> frontend_$ kadeploy3 -a PATH_TO_PRIVATE_IMAGE_DESC -f $OAR_FILE_NODES </source> A Linux distribution lenny-x64-nfs-2.1 with mc, subversion, autotools, doxygen, MPICH2, GSL, Boost, R, gnuplot, graphviz, X11, evince is available at Orsay in /home/nancy/alastovetsky/grid5000. == Compiling and running MPI applications == *Compilation should be done on one of the reserved nodes (e.g. ssh `head -n 1 $OAR_NODEFILE`) *Running MPI applications is described [https://www.grid5000.fr/mediawiki/index.php/Run_MPI_On_Grid%275000 here] **mpirun/mpiexec should be run from one of the reserved nodes (e.g. ssh `head -n 1 $OAR_NODEFILE`) == Setting up new deploy image == ... launch image ...
Edit /etc/apt/sources.list <source lang="text"> deb http://ftp.fr.debian.org/debian/ sid main contrib non-free </source> <source lang="bash"> apt-get update apt-get dist-upgrade apt-get install autoconf automake ctags </source> 3fc491d95fec647397511942d42b4f3b73d3976e 576 575 2011-04-01T13:21:15Z Davepc 2 /* Setting up new deploy image */ wikitext text/x-wiki https://www.grid5000.fr/mediawiki/index.php/Grid5000:Home == Login, job submission, deployment of image == *Select sites and clusters for experiments, using information on the [https://www.grid5000.fr/mediawiki/index.php/Grid5000:Network#Grid.275000_Sites Grid5000 network] and the [https://www.grid5000.fr/mediawiki/index.php/Status Status page] *Access is provided via access nodes '''access.SITE.grid5000.fr''' marked [https://www.grid5000.fr/mediawiki/index.php/External_access here] as ''accessible from '''everywhere''' via ssh with the '''keyboard-interactive''' authentication method''. As soon as you are on one of the sites, you can ssh directly to the frontend node of any other site: <source lang="bash"> access_$ ssh frontend.SITE2 </source> *There is no access to the Internet from computing nodes (external IPs should be registered on the proxy), therefore, download/update your stuff at the access nodes. Several revision control clients are available. *Each site has a separate NFS, therefore, to run an application on several sites at once, you need to copy it ('''scp, sftp, rsync''') between access or frontend nodes. *Jobs are run from the frontend nodes, using a [http://en.wikipedia.org/wiki/OpenPBS PBS]-like system, [https://www.grid5000.fr/mediawiki/index.php/Cluster_experiment-OAR2 OAR].
Basic commands: **'''oarstat''' - queue status **'''oarsub''' - job submission **'''oardel''' - job removal Interactive job on deployed images: <source lang="bash"> fontend_$ oarsub -I -t deploy -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> Batch job on installed images: <source lang="bash"> fontend_$ oarsub BATCH_FILE -t allow_classic_ssh -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> *The image to deploy can be created and loaded with help of a [http://wiki.systemimager.org/index.php/Main_Page Systemimager]-like system [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2 Kadeploy]. Creating: [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2#Tune_an_environment_to_build_another_one:_customize_authentification_parameters described here] Loading: <source lang="bash"> fontend_$ kadeploy3 -a PATH_TO_PRIVATE_IMAGE_DESC -f $OAR_FILE_NODES </source> A Linux distribution lenny-x64-nfs-2.1 with mc, subversion, autotools, doxygen, MPICH2, GSL, Boost, R, gnuplot, graphviz, X11, evince is available at Orsay /home/nancy/alastovetsky/grid5000. == Compiling and running MPI applications == *Compilation should be done on one of the reserved nodes (e.g. ssh `head -n 1 $OAR_NODEFILE`) *Running MPI applications is described [https://www.grid5000.fr/mediawiki/index.php/Run_MPI_On_Grid%275000 here] **mpirun/mpiexec should be run from one of the reserved nodes (e.g. ssh `head -n 1 $OAR_NODEFILE`) == Setting up new deploy image == ... launch image ... 
Edit /etc/apt/sources.list deb http://ftp.fr.debian.org/debian/ sid main contrib non-free === Upgrade image === apt-get update Got the error: W: GPG error: http://ftp.fr.debian.org sid Release: The following signatures couldn't be verified because the public key is not available: NO_PUBKEY AED4B06F473041FA Fixed with apt-get install debian-archive-keyring apt-get update apt-get upgrade === Install packages === apt-get install linux-headers-2.6.38-2-all autoconf automake ctags gsl-bin libgsl0-dev vim mc colorgcc libboost-serialization-dev libboost-graph-dev openmpi-bin openmpi-dev libtool gdb valgrind libatlas-dev screen apt-get install mpich2 libmpich2-dev f7a6324737d7d1249a3814c9efe7ec3e29c01607 577 576 2011-04-04T13:33:51Z Davepc 2 /* Setting up new deploy image */ wikitext text/x-wiki https://www.grid5000.fr/mediawiki/index.php/Grid5000:Home == Login, job submission, deployment of image == *Select sites and clusters for experiments, using information on the [https://www.grid5000.fr/mediawiki/index.php/Grid5000:Network#Grid.275000_Sites Grid5000 network] and the [https://www.grid5000.fr/mediawiki/index.php/Status Status page] *Access is provided via access nodes '''access.SITE.grid5000.fr''' marked [https://www.grid5000.fr/mediawiki/index.php/External_access here] as ''accessible from '''everywhere''' via ssh with '''keyboard-interactive''' authentication method''. As soon as you are on one of the sites, you can directly ssh frontend node of any other site: <source lang="bash"> access_$ ssh frontend.SITE2 </source> *There is no access to Internet from computing nodes (external IPs should be registered on proxy), therefore, download/update your stuff at the access nodes. Several revision control clients are available. *Each site has a separate NFS, therefore, to run an application on several sites at once, you need to copy it '''scp, sftp, rsync''' between access or frontend nodes. 
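The -l resource strings passed to oarsub (nodes=N,walltime=HH[:MM[:SS]], optionally prefixed with /cluster=N/) are easy to get subtly wrong. A small sketch that assembles one; the helper name is ours, not part of OAR:

```shell
# Sketch only: build an oarsub -l resource string such as
# "/cluster=2/nodes=4,walltime=02:00:00". Pass "" for clusters to omit
# the /cluster=N/ prefix. Walltime is given as HH[:MM[:SS]].
oar_resource() {
    clusters=$1; nodes=$2; walltime=$3
    if [ -n "$clusters" ]; then
        printf '/cluster=%s/nodes=%s,walltime=%s\n' "$clusters" "$nodes" "$walltime"
    else
        printf 'nodes=%s,walltime=%s\n' "$nodes" "$walltime"
    fi
}
```

Used e.g. as `oarsub -I -t deploy -l "$(oar_resource 2 4 02:00:00)"`.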
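The sources.list edit during image setup can be scripted so that re-running the setup does not duplicate the entry. A sketch; the file path is taken as an argument so it can be tried on a copy rather than the real /etc/apt/sources.list:

```shell
# Sketch: append the sid line to an apt sources file only if it is not
# already present (grep -qxF matches the exact fixed line).
add_apt_line() {
    file=$1
    line='deb http://ftp.fr.debian.org/debian/ sid main contrib non-free'
    grep -qxF "$line" "$file" 2>/dev/null || echo "$line" >> "$file"
}
```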
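After `make install` of a library like mpich2 or gsl, it is worth confirming that the expected install layout actually materialized before imaging the node. A sketch with a hypothetical helper that takes the prefix as an argument (so it can be pointed at /usr/local or anywhere else):

```shell
# Sketch: check that an install prefix contains the expected
# subdirectories (bin, include, lib, etc, share/man).
# Prints "ok", or "missing:" followed by the absent directories.
check_prefix() {
    prefix=$1
    missing=""
    for d in bin include lib etc share/man; do
        [ -d "$prefix/$d" ] || missing="$missing $d"
    done
    if [ -z "$missing" ]; then echo ok; else echo "missing:$missing"; fi
}
```

Used e.g. as `check_prefix /usr/local` on the deployed node.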
ssh `head -n 1 $OAR_NODEFILE`) == Setting up new deploy image == oarsub -I -t deploy -l nodes=1,walltime=12 kadeploy3 -e lenny-x64-nfs -f $OAR_FILE_NODES -k ssh root@`head -n 1 $OAR_NODEFILE` edit /etc/apt/sources.list apt-get update apt-get upgrade apt-get install libtool autoconf automake mc colorgcc ctags libboost-serialization-dev libboost-graph-dev libatlas-base-dev gfortran vim gdb valgrind screen subversion Compiled for sources by us: * gsl-1.14 ./configure && make && make install * mpich2 ./configure --enable-shared --enable-sharedlibs=gcc make && make install Mpich2 installed to: Installing MPE2 include files to /usr/local/include Installing MPE2 libraries to /usr/local/lib Installing MPE2 utility programs to /usr/local/bin Installing MPE2 configuration files to /usr/local/etc Installing MPE2 system utility programs to /usr/local/sbin Installing MPE2 man to /usr/local/share/man Installing MPE2 html to /usr/local/share/doc/ Installed MPE2 in /usr/local * hwloc (and lstopo) compile from sources. To get xml support install libxml2-dev and pkg-config apt-get install libxml2-dev and pkg-config tar -xzvf hwloc-1.1.1.tar.gz cd hwloc-1.1.1 ./configure && make && make install Cleanup apt-get clean rm /etc/udev/rules.d/*-persistent-net.rules Make image ssh root@'''node''' tgz-g5k &gt; $HOME/grid5000/'''imagename'''.tgz make appropriate .env file. 
421bc8a48a20df7e590e3f4a16612fe6639e9047 588 587 2011-04-13T14:10:44Z Root 1 wikitext text/x-wiki https://www.grid5000.fr/mediawiki/index.php/Grid5000:Home [https://www.grid5000.fr/mediawiki/index.php/Grid5000:UserCharter USAGE POLICY] == Login, job submission, deployment of image == *Select sites and clusters for experiments, using information on the [https://www.grid5000.fr/mediawiki/index.php/Grid5000:Network#Grid.275000_Sites Grid5000 network] and the [https://www.grid5000.fr/mediawiki/index.php/Status Status page] *Access is provided via access nodes '''access.SITE.grid5000.fr''' marked [https://www.grid5000.fr/mediawiki/index.php/External_access here] as ''accessible from '''everywhere''' via ssh with '''keyboard-interactive''' authentication method''. As soon as you are on one of the sites, you can directly ssh frontend node of any other site: <source lang="bash"> access_$ ssh frontend.SITE2 </source> *There is no access to Internet from computing nodes (external IPs should be registered on proxy), therefore, download/update your stuff at the access nodes. Several revision control clients are available. *Each site has a separate NFS, therefore, to run an application on several sites at once, you need to copy it '''scp, sftp, rsync''' between access or frontend nodes. *Jobs are run from the frondend nodes, using a [http://en.wikipedia.org/wiki/OpenPBS PBS]-like system [https://www.grid5000.fr/mediawiki/index.php/Cluster_experiment-OAR2 OAR]. 
Basic commands: **'''oarstat''' - queue status **'''oarsub''' - job submission **'''oardel''' - job removal Interactive job on deployed images: <source lang="bash"> fontend_$ oarsub -I -t deploy -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> Batch job on installed images: <source lang="bash"> fontend_$ oarsub BATCH_FILE -t allow_classic_ssh -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> *The image to deploy can be created and loaded with help of a [http://wiki.systemimager.org/index.php/Main_Page Systemimager]-like system [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2 Kadeploy]. Creating: [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2#Tune_an_environment_to_build_another_one:_customize_authentification_parameters described here] Loading: <source lang="bash"> fontend_$ kadeploy3 -a PATH_TO_PRIVATE_IMAGE_DESC -f $OAR_FILE_NODES </source> A Linux distribution lenny-x64-nfs-2.1 with mc, subversion, autotools, doxygen, MPICH2, GSL, Boost, R, gnuplot, graphviz, X11, evince is available at Orsay /home/nancy/alastovetsky/grid5000. == Compiling and running MPI applications == *Compilation should be done on one of the reserved nodes (e.g. ssh `head -n 1 $OAR_NODEFILE`) *Running MPI applications is described [https://www.grid5000.fr/mediawiki/index.php/Run_MPI_On_Grid%275000 here] **mpirun/mpiexec should be run from one of the reserved nodes (e.g. 
ssh `head -n 1 $OAR_NODEFILE`) == Setting up new deploy image == oarsub -I -t deploy -l nodes=1,walltime=12 kadeploy3 -e lenny-x64-nfs -f $OAR_FILE_NODES -k ssh root@`head -n 1 $OAR_NODEFILE` edit /etc/apt/sources.list apt-get update apt-get upgrade apt-get install libtool autoconf automake mc colorgcc ctags libboost-serialization-dev libboost-graph-dev libatlas-base-dev gfortran vim gdb valgrind screen subversion Compiled for sources by us: * gsl-1.14 ./configure && make && make install * mpich2 ./configure --enable-shared --enable-sharedlibs=gcc make && make install Mpich2 installed to: Installing MPE2 include files to /usr/local/include Installing MPE2 libraries to /usr/local/lib Installing MPE2 utility programs to /usr/local/bin Installing MPE2 configuration files to /usr/local/etc Installing MPE2 system utility programs to /usr/local/sbin Installing MPE2 man to /usr/local/share/man Installing MPE2 html to /usr/local/share/doc/ Installed MPE2 in /usr/local * hwloc (and lstopo) compile from sources. To get xml support install libxml2-dev and pkg-config apt-get install libxml2-dev and pkg-config tar -xzvf hwloc-1.1.1.tar.gz cd hwloc-1.1.1 ./configure && make && make install Cleanup apt-get clean rm /etc/udev/rules.d/*-persistent-net.rules Make image ssh root@'''node''' tgz-g5k &gt; $HOME/grid5000/'''imagename'''.tgz make appropriate .env file. 
8a849bc663af076f2be71fad964d4baf677e635a Grid5000 0 6 590 588 2011-05-03T16:34:01Z Zhongziming 5 /* Setting up new deploy image */ wikitext text/x-wiki https://www.grid5000.fr/mediawiki/index.php/Grid5000:Home [https://www.grid5000.fr/mediawiki/index.php/Grid5000:UserCharter USAGE POLICY] == Login, job submission, deployment of image == *Select sites and clusters for experiments, using information on the [https://www.grid5000.fr/mediawiki/index.php/Grid5000:Network#Grid.275000_Sites Grid5000 network] and the [https://www.grid5000.fr/mediawiki/index.php/Status Status page] *Access is provided via access nodes '''access.SITE.grid5000.fr''' marked [https://www.grid5000.fr/mediawiki/index.php/External_access here] as ''accessible from '''everywhere''' via ssh with '''keyboard-interactive''' authentication method''. As soon as you are on one of the sites, you can directly ssh frontend node of any other site: <source lang="bash"> access_$ ssh frontend.SITE2 </source> *There is no access to Internet from computing nodes (external IPs should be registered on proxy), therefore, download/update your stuff at the access nodes. Several revision control clients are available. *Each site has a separate NFS, therefore, to run an application on several sites at once, you need to copy it '''scp, sftp, rsync''' between access or frontend nodes. *Jobs are run from the frondend nodes, using a [http://en.wikipedia.org/wiki/OpenPBS PBS]-like system [https://www.grid5000.fr/mediawiki/index.php/Cluster_experiment-OAR2 OAR]. 
Basic commands: **'''oarstat''' - queue status **'''oarsub''' - job submission **'''oardel''' - job removal Interactive job on deployed images: <source lang="bash"> fontend_$ oarsub -I -t deploy -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> Batch job on installed images: <source lang="bash"> fontend_$ oarsub BATCH_FILE -t allow_classic_ssh -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> *The image to deploy can be created and loaded with help of a [http://wiki.systemimager.org/index.php/Main_Page Systemimager]-like system [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2 Kadeploy]. Creating: [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2#Tune_an_environment_to_build_another_one:_customize_authentification_parameters described here] Loading: <source lang="bash"> fontend_$ kadeploy3 -a PATH_TO_PRIVATE_IMAGE_DESC -f $OAR_FILE_NODES </source> A Linux distribution lenny-x64-nfs-2.1 with mc, subversion, autotools, doxygen, MPICH2, GSL, Boost, R, gnuplot, graphviz, X11, evince is available at Orsay /home/nancy/alastovetsky/grid5000. == Compiling and running MPI applications == *Compilation should be done on one of the reserved nodes (e.g. ssh `head -n 1 $OAR_NODEFILE`) *Running MPI applications is described [https://www.grid5000.fr/mediawiki/index.php/Run_MPI_On_Grid%275000 here] **mpirun/mpiexec should be run from one of the reserved nodes (e.g. 
ssh `head -n 1 $OAR_NODEFILE`) == Setting up new deploy image == oarsub -I -t deploy -l nodes=1,walltime=12 kadeploy3 -e lenny-x64-nfs -f $OAR_FILE_NODES -k ssh root@`head -n 1 $OAR_NODEFILE` edit /etc/apt/sources.list apt-get update apt-get upgrade apt-get install libtool autoconf automake mc colorgcc ctags libboost-serialization-dev libboost-graph-dev libatlas-base-dev gfortran vim gdb valgrind screen subversion Compiled for sources by us: * gsl-1.14 ftp://ftp.gnu.org/gnu/gsl/ ./configure && make && make install * mpich2 ./configure --enable-shared --enable-sharedlibs=gcc make && make install Mpich2 installed to: Installing MPE2 include files to /usr/local/include Installing MPE2 libraries to /usr/local/lib Installing MPE2 utility programs to /usr/local/bin Installing MPE2 configuration files to /usr/local/etc Installing MPE2 system utility programs to /usr/local/sbin Installing MPE2 man to /usr/local/share/man Installing MPE2 html to /usr/local/share/doc/ Installed MPE2 in /usr/local * hwloc (and lstopo) compile from sources. To get xml support install libxml2-dev and pkg-config apt-get install libxml2-dev and pkg-config tar -xzvf hwloc-1.1.1.tar.gz cd hwloc-1.1.1 ./configure && make && make install Cleanup apt-get clean rm /etc/udev/rules.d/*-persistent-net.rules Make image ssh root@'''node''' tgz-g5k &gt; $HOME/grid5000/'''imagename'''.tgz make appropriate .env file. 
099883ebd7664dad0f18f57b3dcca2fd62a73f77 591 590 2011-05-03T16:34:31Z Zhongziming 5 /* Setting up new deploy image */ wikitext text/x-wiki https://www.grid5000.fr/mediawiki/index.php/Grid5000:Home [https://www.grid5000.fr/mediawiki/index.php/Grid5000:UserCharter USAGE POLICY] == Login, job submission, deployment of image == *Select sites and clusters for experiments, using information on the [https://www.grid5000.fr/mediawiki/index.php/Grid5000:Network#Grid.275000_Sites Grid5000 network] and the [https://www.grid5000.fr/mediawiki/index.php/Status Status page] *Access is provided via access nodes '''access.SITE.grid5000.fr''' marked [https://www.grid5000.fr/mediawiki/index.php/External_access here] as ''accessible from '''everywhere''' via ssh with '''keyboard-interactive''' authentication method''. As soon as you are on one of the sites, you can directly ssh frontend node of any other site: <source lang="bash"> access_$ ssh frontend.SITE2 </source> *There is no access to Internet from computing nodes (external IPs should be registered on proxy), therefore, download/update your stuff at the access nodes. Several revision control clients are available. *Each site has a separate NFS, therefore, to run an application on several sites at once, you need to copy it '''scp, sftp, rsync''' between access or frontend nodes. *Jobs are run from the frondend nodes, using a [http://en.wikipedia.org/wiki/OpenPBS PBS]-like system [https://www.grid5000.fr/mediawiki/index.php/Cluster_experiment-OAR2 OAR]. 
Basic commands: **'''oarstat''' - queue status **'''oarsub''' - job submission **'''oardel''' - job removal Interactive job on deployed images: <source lang="bash"> fontend_$ oarsub -I -t deploy -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> Batch job on installed images: <source lang="bash"> fontend_$ oarsub BATCH_FILE -t allow_classic_ssh -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> *The image to deploy can be created and loaded with help of a [http://wiki.systemimager.org/index.php/Main_Page Systemimager]-like system [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2 Kadeploy]. Creating: [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2#Tune_an_environment_to_build_another_one:_customize_authentification_parameters described here] Loading: <source lang="bash"> fontend_$ kadeploy3 -a PATH_TO_PRIVATE_IMAGE_DESC -f $OAR_FILE_NODES </source> A Linux distribution lenny-x64-nfs-2.1 with mc, subversion, autotools, doxygen, MPICH2, GSL, Boost, R, gnuplot, graphviz, X11, evince is available at Orsay /home/nancy/alastovetsky/grid5000. == Compiling and running MPI applications == *Compilation should be done on one of the reserved nodes (e.g. ssh `head -n 1 $OAR_NODEFILE`) *Running MPI applications is described [https://www.grid5000.fr/mediawiki/index.php/Run_MPI_On_Grid%275000 here] **mpirun/mpiexec should be run from one of the reserved nodes (e.g. 
ssh `head -n 1 $OAR_NODEFILE`) == Setting up new deploy image == oarsub -I -t deploy -l nodes=1,walltime=12 kadeploy3 -e lenny-x64-nfs -f $OAR_FILE_NODES -k ssh root@`head -n 1 $OAR_NODEFILE` edit /etc/apt/sources.list apt-get update apt-get upgrade apt-get install libtool autoconf automake mc colorgcc ctags libboost-serialization-dev libboost-graph-dev libatlas-base-dev gfortran vim gdb valgrind screen subversion Compiled for sources by us: * gsl-1.14 * gsl download: ftp://ftp.gnu.org/gnu/gsl/ ./configure && make && make install * mpich2 ./configure --enable-shared --enable-sharedlibs=gcc make && make install Mpich2 installed to: Installing MPE2 include files to /usr/local/include Installing MPE2 libraries to /usr/local/lib Installing MPE2 utility programs to /usr/local/bin Installing MPE2 configuration files to /usr/local/etc Installing MPE2 system utility programs to /usr/local/sbin Installing MPE2 man to /usr/local/share/man Installing MPE2 html to /usr/local/share/doc/ Installed MPE2 in /usr/local * hwloc (and lstopo) compile from sources. To get xml support install libxml2-dev and pkg-config apt-get install libxml2-dev and pkg-config tar -xzvf hwloc-1.1.1.tar.gz cd hwloc-1.1.1 ./configure && make && make install Cleanup apt-get clean rm /etc/udev/rules.d/*-persistent-net.rules Make image ssh root@'''node''' tgz-g5k &gt; $HOME/grid5000/'''imagename'''.tgz make appropriate .env file. 
0d3f9367dcc860d07debc774727f844a12b808ed 592 591 2011-05-03T16:35:35Z Zhongziming 5 /* Setting up new deploy image */ wikitext text/x-wiki https://www.grid5000.fr/mediawiki/index.php/Grid5000:Home [https://www.grid5000.fr/mediawiki/index.php/Grid5000:UserCharter USAGE POLICY] == Login, job submission, deployment of image == *Select sites and clusters for experiments, using information on the [https://www.grid5000.fr/mediawiki/index.php/Grid5000:Network#Grid.275000_Sites Grid5000 network] and the [https://www.grid5000.fr/mediawiki/index.php/Status Status page] *Access is provided via access nodes '''access.SITE.grid5000.fr''' marked [https://www.grid5000.fr/mediawiki/index.php/External_access here] as ''accessible from '''everywhere''' via ssh with '''keyboard-interactive''' authentication method''. As soon as you are on one of the sites, you can directly ssh frontend node of any other site: <source lang="bash"> access_$ ssh frontend.SITE2 </source> *There is no access to Internet from computing nodes (external IPs should be registered on proxy), therefore, download/update your stuff at the access nodes. Several revision control clients are available. *Each site has a separate NFS, therefore, to run an application on several sites at once, you need to copy it '''scp, sftp, rsync''' between access or frontend nodes. *Jobs are run from the frondend nodes, using a [http://en.wikipedia.org/wiki/OpenPBS PBS]-like system [https://www.grid5000.fr/mediawiki/index.php/Cluster_experiment-OAR2 OAR]. 
Basic commands: **'''oarstat''' - queue status **'''oarsub''' - job submission **'''oardel''' - job removal Interactive job on deployed images: <source lang="bash"> fontend_$ oarsub -I -t deploy -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> Batch job on installed images: <source lang="bash"> fontend_$ oarsub BATCH_FILE -t allow_classic_ssh -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> *The image to deploy can be created and loaded with help of a [http://wiki.systemimager.org/index.php/Main_Page Systemimager]-like system [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2 Kadeploy]. Creating: [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2#Tune_an_environment_to_build_another_one:_customize_authentification_parameters described here] Loading: <source lang="bash"> fontend_$ kadeploy3 -a PATH_TO_PRIVATE_IMAGE_DESC -f $OAR_FILE_NODES </source> A Linux distribution lenny-x64-nfs-2.1 with mc, subversion, autotools, doxygen, MPICH2, GSL, Boost, R, gnuplot, graphviz, X11, evince is available at Orsay /home/nancy/alastovetsky/grid5000. == Compiling and running MPI applications == *Compilation should be done on one of the reserved nodes (e.g. ssh `head -n 1 $OAR_NODEFILE`) *Running MPI applications is described [https://www.grid5000.fr/mediawiki/index.php/Run_MPI_On_Grid%275000 here] **mpirun/mpiexec should be run from one of the reserved nodes (e.g. 
ssh `head -n 1 $OAR_NODEFILE`) == Setting up new deploy image == oarsub -I -t deploy -l nodes=1,walltime=12 kadeploy3 -e lenny-x64-nfs -f $OAR_FILE_NODES -k ssh root@`head -n 1 $OAR_NODEFILE` edit /etc/apt/sources.list apt-get update apt-get upgrade apt-get install libtool autoconf automake mc colorgcc ctags libboost-serialization-dev libboost-graph-dev libatlas-base-dev gfortran vim gdb valgrind screen subversion Compiled for sources by us: * gsl-1.14 (download: ftp://ftp.gnu.org/gnu/gsl/) ./configure && make && make install * mpich2 ./configure --enable-shared --enable-sharedlibs=gcc make && make install Mpich2 installed to: Installing MPE2 include files to /usr/local/include Installing MPE2 libraries to /usr/local/lib Installing MPE2 utility programs to /usr/local/bin Installing MPE2 configuration files to /usr/local/etc Installing MPE2 system utility programs to /usr/local/sbin Installing MPE2 man to /usr/local/share/man Installing MPE2 html to /usr/local/share/doc/ Installed MPE2 in /usr/local * hwloc (and lstopo) compile from sources. To get xml support install libxml2-dev and pkg-config apt-get install libxml2-dev and pkg-config tar -xzvf hwloc-1.1.1.tar.gz cd hwloc-1.1.1 ./configure && make && make install Cleanup apt-get clean rm /etc/udev/rules.d/*-persistent-net.rules Make image ssh root@'''node''' tgz-g5k &gt; $HOME/grid5000/'''imagename'''.tgz make appropriate .env file. 
40236275ebf4094f0e47a78000ac055f5ae0a75d 593 592 2011-05-03T16:39:38Z Zhongziming 5 /* Setting up new deploy image */ wikitext text/x-wiki https://www.grid5000.fr/mediawiki/index.php/Grid5000:Home [https://www.grid5000.fr/mediawiki/index.php/Grid5000:UserCharter USAGE POLICY] == Login, job submission, deployment of image == *Select sites and clusters for experiments, using information on the [https://www.grid5000.fr/mediawiki/index.php/Grid5000:Network#Grid.275000_Sites Grid5000 network] and the [https://www.grid5000.fr/mediawiki/index.php/Status Status page] *Access is provided via access nodes '''access.SITE.grid5000.fr''' marked [https://www.grid5000.fr/mediawiki/index.php/External_access here] as ''accessible from '''everywhere''' via ssh with '''keyboard-interactive''' authentication method''. As soon as you are on one of the sites, you can directly ssh frontend node of any other site: <source lang="bash"> access_$ ssh frontend.SITE2 </source> *There is no access to Internet from computing nodes (external IPs should be registered on proxy), therefore, download/update your stuff at the access nodes. Several revision control clients are available. *Each site has a separate NFS, therefore, to run an application on several sites at once, you need to copy it '''scp, sftp, rsync''' between access or frontend nodes. *Jobs are run from the frondend nodes, using a [http://en.wikipedia.org/wiki/OpenPBS PBS]-like system [https://www.grid5000.fr/mediawiki/index.php/Cluster_experiment-OAR2 OAR]. 
Basic commands: **'''oarstat''' - queue status **'''oarsub''' - job submission **'''oardel''' - job removal Interactive job on deployed images: <source lang="bash"> fontend_$ oarsub -I -t deploy -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> Batch job on installed images: <source lang="bash"> fontend_$ oarsub BATCH_FILE -t allow_classic_ssh -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> *The image to deploy can be created and loaded with help of a [http://wiki.systemimager.org/index.php/Main_Page Systemimager]-like system [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2 Kadeploy]. Creating: [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2#Tune_an_environment_to_build_another_one:_customize_authentification_parameters described here] Loading: <source lang="bash"> fontend_$ kadeploy3 -a PATH_TO_PRIVATE_IMAGE_DESC -f $OAR_FILE_NODES </source> A Linux distribution lenny-x64-nfs-2.1 with mc, subversion, autotools, doxygen, MPICH2, GSL, Boost, R, gnuplot, graphviz, X11, evince is available at Orsay /home/nancy/alastovetsky/grid5000. == Compiling and running MPI applications == *Compilation should be done on one of the reserved nodes (e.g. ssh `head -n 1 $OAR_NODEFILE`) *Running MPI applications is described [https://www.grid5000.fr/mediawiki/index.php/Run_MPI_On_Grid%275000 here] **mpirun/mpiexec should be run from one of the reserved nodes (e.g. 
ssh `head -n 1 $OAR_NODEFILE`) == Setting up new deploy image == oarsub -I -t deploy -l nodes=1,walltime=12 kadeploy3 -e lenny-x64-nfs -f $OAR_FILE_NODES -k ssh root@`head -n 1 $OAR_NODEFILE` edit /etc/apt/sources.list apt-get update apt-get upgrade apt-get install libtool autoconf automake mc colorgcc ctags libboost-serialization-dev libboost-graph-dev libatlas-base-dev gfortran vim gdb valgrind screen subversion Compiled for sources by us: * gsl-1.14 (download: ftp://ftp.gnu.org/gnu/gsl/) ./configure && make && make install * mpich2 (download: http://www.mcs.anl.gov/research/projects/mpich2/downloads/index.php?s=downloads) ./configure --enable-shared --enable-sharedlibs=gcc make && make install Mpich2 installed to: Installing MPE2 include files to /usr/local/include Installing MPE2 libraries to /usr/local/lib Installing MPE2 utility programs to /usr/local/bin Installing MPE2 configuration files to /usr/local/etc Installing MPE2 system utility programs to /usr/local/sbin Installing MPE2 man to /usr/local/share/man Installing MPE2 html to /usr/local/share/doc/ Installed MPE2 in /usr/local * hwloc (and lstopo) compile from sources. To get xml support install libxml2-dev and pkg-config apt-get install libxml2-dev and pkg-config tar -xzvf hwloc-1.1.1.tar.gz cd hwloc-1.1.1 ./configure && make && make install Cleanup apt-get clean rm /etc/udev/rules.d/*-persistent-net.rules Make image ssh root@'''node''' tgz-g5k &gt; $HOME/grid5000/'''imagename'''.tgz make appropriate .env file. 
1be769e871c4892a5e09212d40e44ea3a3277b34 594 593 2011-05-03T16:43:28Z Zhongziming 5 /* Setting up new deploy image */ wikitext text/x-wiki https://www.grid5000.fr/mediawiki/index.php/Grid5000:Home [https://www.grid5000.fr/mediawiki/index.php/Grid5000:UserCharter USAGE POLICY] == Login, job submission, deployment of image == *Select sites and clusters for experiments, using information on the [https://www.grid5000.fr/mediawiki/index.php/Grid5000:Network#Grid.275000_Sites Grid5000 network] and the [https://www.grid5000.fr/mediawiki/index.php/Status Status page] *Access is provided via access nodes '''access.SITE.grid5000.fr''' marked [https://www.grid5000.fr/mediawiki/index.php/External_access here] as ''accessible from '''everywhere''' via ssh with '''keyboard-interactive''' authentication method''. As soon as you are on one of the sites, you can directly ssh frontend node of any other site: <source lang="bash"> access_$ ssh frontend.SITE2 </source> *There is no access to Internet from computing nodes (external IPs should be registered on proxy), therefore, download/update your stuff at the access nodes. Several revision control clients are available. *Each site has a separate NFS, therefore, to run an application on several sites at once, you need to copy it '''scp, sftp, rsync''' between access or frontend nodes. *Jobs are run from the frondend nodes, using a [http://en.wikipedia.org/wiki/OpenPBS PBS]-like system [https://www.grid5000.fr/mediawiki/index.php/Cluster_experiment-OAR2 OAR]. 
Basic commands: **'''oarstat''' - queue status **'''oarsub''' - job submission **'''oardel''' - job removal Interactive job on deployed images: <source lang="bash"> fontend_$ oarsub -I -t deploy -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> Batch job on installed images: <source lang="bash"> fontend_$ oarsub BATCH_FILE -t allow_classic_ssh -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> *The image to deploy can be created and loaded with help of a [http://wiki.systemimager.org/index.php/Main_Page Systemimager]-like system [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2 Kadeploy]. Creating: [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2#Tune_an_environment_to_build_another_one:_customize_authentification_parameters described here] Loading: <source lang="bash"> fontend_$ kadeploy3 -a PATH_TO_PRIVATE_IMAGE_DESC -f $OAR_FILE_NODES </source> A Linux distribution lenny-x64-nfs-2.1 with mc, subversion, autotools, doxygen, MPICH2, GSL, Boost, R, gnuplot, graphviz, X11, evince is available at Orsay /home/nancy/alastovetsky/grid5000. == Compiling and running MPI applications == *Compilation should be done on one of the reserved nodes (e.g. ssh `head -n 1 $OAR_NODEFILE`) *Running MPI applications is described [https://www.grid5000.fr/mediawiki/index.php/Run_MPI_On_Grid%275000 here] **mpirun/mpiexec should be run from one of the reserved nodes (e.g. 
ssh `head -n 1 $OAR_NODEFILE`) == Setting up new deploy image == oarsub -I -t deploy -l nodes=1,walltime=12 kadeploy3 -e lenny-x64-nfs -f $OAR_FILE_NODES -k ssh root@`head -n 1 $OAR_NODEFILE` edit /etc/apt/sources.list apt-get update apt-get upgrade apt-get install libtool autoconf automake mc colorgcc ctags libboost-serialization-dev libboost-graph-dev libatlas-base-dev gfortran vim gdb valgrind screen subversion Compiled for sources by us: * gsl-1.14 (download: ftp://ftp.gnu.org/gnu/gsl/) ./configure && make && make install * mpich2 (download: http://www.mcs.anl.gov/research/projects/mpich2/downloads/index.php?s=downloads) ./configure --enable-shared --enable-sharedlibs=gcc make && make install Mpich2 installed to: Installing MPE2 include files to /usr/local/include Installing MPE2 libraries to /usr/local/lib Installing MPE2 utility programs to /usr/local/bin Installing MPE2 configuration files to /usr/local/etc Installing MPE2 system utility programs to /usr/local/sbin Installing MPE2 man to /usr/local/share/man Installing MPE2 html to /usr/local/share/doc/ Installed MPE2 in /usr/local * hwloc (and lstopo) (download: http://www.open-mpi.org/software/hwloc/v1.2/) compile from sources. To get xml support install libxml2-dev and pkg-config apt-get install libxml2-dev and pkg-config tar -xzvf hwloc-1.1.1.tar.gz cd hwloc-1.1.1 ./configure && make && make install Cleanup apt-get clean rm /etc/udev/rules.d/*-persistent-net.rules Make image ssh root@'''node''' tgz-g5k &gt; $HOME/grid5000/'''imagename'''.tgz make appropriate .env file. 
8639bdb96f618398726c0eb9b998b17abfaa68f1 595 594 2011-05-03T16:51:40Z Zhongziming 5 /* Setting up new deploy image */ wikitext text/x-wiki https://www.grid5000.fr/mediawiki/index.php/Grid5000:Home [https://www.grid5000.fr/mediawiki/index.php/Grid5000:UserCharter USAGE POLICY] == Login, job submission, deployment of image == *Select sites and clusters for experiments, using information on the [https://www.grid5000.fr/mediawiki/index.php/Grid5000:Network#Grid.275000_Sites Grid5000 network] and the [https://www.grid5000.fr/mediawiki/index.php/Status Status page] *Access is provided via access nodes '''access.SITE.grid5000.fr''' marked [https://www.grid5000.fr/mediawiki/index.php/External_access here] as ''accessible from '''everywhere''' via ssh with '''keyboard-interactive''' authentication method''. As soon as you are on one of the sites, you can directly ssh frontend node of any other site: <source lang="bash"> access_$ ssh frontend.SITE2 </source> *There is no access to Internet from computing nodes (external IPs should be registered on proxy), therefore, download/update your stuff at the access nodes. Several revision control clients are available. *Each site has a separate NFS, therefore, to run an application on several sites at once, you need to copy it '''scp, sftp, rsync''' between access or frontend nodes. *Jobs are run from the frondend nodes, using a [http://en.wikipedia.org/wiki/OpenPBS PBS]-like system [https://www.grid5000.fr/mediawiki/index.php/Cluster_experiment-OAR2 OAR]. 
Basic commands: **'''oarstat''' - queue status **'''oarsub''' - job submission **'''oardel''' - job removal Interactive job on deployed images: <source lang="bash"> fontend_$ oarsub -I -t deploy -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> Batch job on installed images: <source lang="bash"> fontend_$ oarsub BATCH_FILE -t allow_classic_ssh -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> *The image to deploy can be created and loaded with help of a [http://wiki.systemimager.org/index.php/Main_Page Systemimager]-like system [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2 Kadeploy]. Creating: [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2#Tune_an_environment_to_build_another_one:_customize_authentification_parameters described here] Loading: <source lang="bash"> fontend_$ kadeploy3 -a PATH_TO_PRIVATE_IMAGE_DESC -f $OAR_FILE_NODES </source> A Linux distribution lenny-x64-nfs-2.1 with mc, subversion, autotools, doxygen, MPICH2, GSL, Boost, R, gnuplot, graphviz, X11, evince is available at Orsay /home/nancy/alastovetsky/grid5000. == Compiling and running MPI applications == *Compilation should be done on one of the reserved nodes (e.g. ssh `head -n 1 $OAR_NODEFILE`) *Running MPI applications is described [https://www.grid5000.fr/mediawiki/index.php/Run_MPI_On_Grid%275000 here] **mpirun/mpiexec should be run from one of the reserved nodes (e.g. 
ssh `head -n 1 $OAR_NODEFILE`) == Setting up new deploy image == oarsub -I -t deploy -l nodes=1,walltime=12 kadeploy3 -e lenny-x64-nfs -f $OAR_FILE_NODES -k ssh root@`head -n 1 $OAR_NODEFILE` edit /etc/apt/sources.list apt-get update apt-get upgrade apt-get install libtool autoconf automake mc colorgcc ctags libboost-serialization-dev libboost-graph-dev libatlas-base-dev gfortran vim gdb valgrind screen subversion Compiled for sources by us: * gsl-1.14 (download: ftp://ftp.gnu.org/gnu/gsl/) ./configure && make && make install * mpich2 (download: http://www.mcs.anl.gov/research/projects/mpich2/downloads/index.php?s=downloads) ./configure --enable-shared --enable-sharedlibs=gcc make && make install Mpich2 installed to: Installing MPE2 include files to /usr/local/include Installing MPE2 libraries to /usr/local/lib Installing MPE2 utility programs to /usr/local/bin Installing MPE2 configuration files to /usr/local/etc Installing MPE2 system utility programs to /usr/local/sbin Installing MPE2 man to /usr/local/share/man Installing MPE2 html to /usr/local/share/doc/ Installed MPE2 in /usr/local * hwloc (and lstopo) (download: http://www.open-mpi.org/software/hwloc/v1.2/) compile from sources. To get xml support install libxml2-dev and pkg-config apt-get install libxml2-dev pkg-config tar -xzvf hwloc-1.1.1.tar.gz cd hwloc-1.1.1 ./configure && make && make install Cleanup apt-get clean rm /etc/udev/rules.d/*-persistent-net.rules Make image ssh root@'''node''' tgz-g5k &gt; $HOME/grid5000/'''imagename'''.tgz make appropriate .env file. 
fdee4876da3cb1f4ec8e64654e7e1d04731f1eaa 596 595 2011-05-03T18:35:04Z Zhongziming 5 /* Setting up new deploy image */ wikitext text/x-wiki https://www.grid5000.fr/mediawiki/index.php/Grid5000:Home [https://www.grid5000.fr/mediawiki/index.php/Grid5000:UserCharter USAGE POLICY] == Login, job submission, deployment of image == *Select sites and clusters for experiments, using information on the [https://www.grid5000.fr/mediawiki/index.php/Grid5000:Network#Grid.275000_Sites Grid5000 network] and the [https://www.grid5000.fr/mediawiki/index.php/Status Status page] *Access is provided via access nodes '''access.SITE.grid5000.fr''' marked [https://www.grid5000.fr/mediawiki/index.php/External_access here] as ''accessible from '''everywhere''' via ssh with '''keyboard-interactive''' authentication method''. As soon as you are on one of the sites, you can directly ssh frontend node of any other site: <source lang="bash"> access_$ ssh frontend.SITE2 </source> *There is no access to Internet from computing nodes (external IPs should be registered on proxy), therefore, download/update your stuff at the access nodes. Several revision control clients are available. *Each site has a separate NFS, therefore, to run an application on several sites at once, you need to copy it '''scp, sftp, rsync''' between access or frontend nodes. *Jobs are run from the frondend nodes, using a [http://en.wikipedia.org/wiki/OpenPBS PBS]-like system [https://www.grid5000.fr/mediawiki/index.php/Cluster_experiment-OAR2 OAR]. 
Basic commands: **'''oarstat''' - queue status **'''oarsub''' - job submission **'''oardel''' - job removal Interactive job on deployed images: <source lang="bash"> fontend_$ oarsub -I -t deploy -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> Batch job on installed images: <source lang="bash"> fontend_$ oarsub BATCH_FILE -t allow_classic_ssh -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> *The image to deploy can be created and loaded with help of a [http://wiki.systemimager.org/index.php/Main_Page Systemimager]-like system [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2 Kadeploy]. Creating: [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2#Tune_an_environment_to_build_another_one:_customize_authentification_parameters described here] Loading: <source lang="bash"> fontend_$ kadeploy3 -a PATH_TO_PRIVATE_IMAGE_DESC -f $OAR_FILE_NODES </source> A Linux distribution lenny-x64-nfs-2.1 with mc, subversion, autotools, doxygen, MPICH2, GSL, Boost, R, gnuplot, graphviz, X11, evince is available at Orsay /home/nancy/alastovetsky/grid5000. == Compiling and running MPI applications == *Compilation should be done on one of the reserved nodes (e.g. ssh `head -n 1 $OAR_NODEFILE`) *Running MPI applications is described [https://www.grid5000.fr/mediawiki/index.php/Run_MPI_On_Grid%275000 here] **mpirun/mpiexec should be run from one of the reserved nodes (e.g. 
ssh `head -n 1 $OAR_NODEFILE`) == Setting up new deploy image == oarsub -I -t deploy -l nodes=1,walltime=12 kadeploy3 -e lenny-x64-nfs -f $OAR_FILE_NODES -k ssh root@`head -n 1 $OAR_NODEFILE` edit /etc/apt/sources.list apt-get update apt-get upgrade apt-get install libtool autoconf automake mc colorgcc ctags libboost-serialization-dev libboost-graph-dev libatlas-base-dev gfortran vim gdb valgrind screen subversion Compiled for sources by us: * gsl-1.14 (download: ftp://ftp.gnu.org/gnu/gsl/) ./configure && make && make install * mpich2 (download: http://www.mcs.anl.gov/research/projects/mpich2/downloads/index.php?s=downloads) ./configure --enable-shared --enable-sharedlibs=gcc make && make install Mpich2 installed to: Installing MPE2 include files to /usr/local/include Installing MPE2 libraries to /usr/local/lib Installing MPE2 utility programs to /usr/local/bin Installing MPE2 configuration files to /usr/local/etc Installing MPE2 system utility programs to /usr/local/sbin Installing MPE2 man to /usr/local/share/man Installing MPE2 html to /usr/local/share/doc/ Installed MPE2 in /usr/local * hwloc (and lstopo) (download: http://www.open-mpi.org/software/hwloc/v1.2/) compile from sources. To get xml support install libxml2-dev and pkg-config apt-get install libxml2-dev pkg-config tar -xzvf hwloc-1.1.1.tar.gz cd hwloc-1.1.1 ./configure && make && make install Cleanup apt-get clean rm /etc/udev/rules.d/*-persistent-net.rules Make image ssh root@'''node''' tgz-g5k &gt; $HOME/grid5000/'''imagename'''.tgz make appropriate .env file. 
kaenv3 -p lenny-x64-nfs -u deploy > lenny-x64-custom-2.3.env e7630a83efa23449076c6291ecf774415b69ff29 597 596 2011-05-19T02:37:34Z Davepc 2 wikitext text/x-wiki https://www.grid5000.fr/mediawiki/index.php/Grid5000:Home [https://www.grid5000.fr/mediawiki/index.php/Grid5000:UserCharter USAGE POLICY] == Login, job submission, deployment of image == *Select sites and clusters for experiments, using information on the [https://www.grid5000.fr/mediawiki/index.php/Grid5000:Network#Grid.275000_Sites Grid5000 network] and the [https://www.grid5000.fr/mediawiki/index.php/Status Status page] *Access is provided via access nodes '''access.SITE.grid5000.fr''' marked [https://www.grid5000.fr/mediawiki/index.php/External_access here] as ''accessible from '''everywhere''' via ssh with '''keyboard-interactive''' authentication method''. As soon as you are on one of the sites, you can directly ssh frontend node of any other site: <source lang="bash"> access_$ ssh frontend.SITE2 </source> *There is no access to Internet from computing nodes (external IPs should be registered on proxy), therefore, download/update your stuff at the access nodes. Several revision control clients are available. *Each site has a separate NFS, therefore, to run an application on several sites at once, you need to copy it '''scp, sftp, rsync''' between access or frontend nodes. *Jobs are run from the frondend nodes, using a [http://en.wikipedia.org/wiki/OpenPBS PBS]-like system [https://www.grid5000.fr/mediawiki/index.php/Cluster_experiment-OAR2 OAR]. 
Basic commands: **'''oarstat''' - queue status **'''oarsub''' - job submission **'''oardel''' - job removal Interactive job on deployed images: <source lang="bash"> fontend_$ oarsub -I -t deploy -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> Batch job on installed images: <source lang="bash"> fontend_$ oarsub BATCH_FILE -t allow_classic_ssh -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> *The image to deploy can be created and loaded with help of a [http://wiki.systemimager.org/index.php/Main_Page Systemimager]-like system [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2 Kadeploy]. Creating: [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2#Tune_an_environment_to_build_another_one:_customize_authentification_parameters described here] Loading: <source lang="bash"> fontend_$ kadeploy3 -a PATH_TO_PRIVATE_IMAGE_DESC -f $OAR_FILE_NODES </source> A Linux distribution lenny-x64-nfs-2.1 with mc, subversion, autotools, doxygen, MPICH2, GSL, Boost, R, gnuplot, graphviz, X11, evince is available at Orsay /home/nancy/alastovetsky/grid5000. == Compiling and running MPI applications == *Compilation should be done on one of the reserved nodes (e.g. ssh `head -n 1 $OAR_NODEFILE`) *Running MPI applications is described [https://www.grid5000.fr/mediawiki/index.php/Run_MPI_On_Grid%275000 here] **mpirun/mpiexec should be run from one of the reserved nodes (e.g. 
ssh `head -n 1 $OAR_NODEFILE`) == Setting up new deploy image == oarsub -I -t deploy -l nodes=1,walltime=12 kadeploy3 -e lenny-x64-nfs -f $OAR_FILE_NODES -k ssh root@`head -n 1 $OAR_NODEFILE` edit /etc/apt/sources.list apt-get update apt-get upgrade apt-get install libtool autoconf automake mc colorgcc ctags libboost-serialization-dev libboost-graph-dev libatlas-base-dev gfortran vim gdb valgrind screen subversion <br> Compiled for sources by us: *gsl-1.14 (download: ftp://ftp.gnu.org/gnu/gsl/) ./configure &amp;&amp; make &amp;&amp; make install *mpich2 (download: http://www.mcs.anl.gov/research/projects/mpich2/downloads/index.php?s=downloads) ./configure --enable-shared --enable-sharedlibs=gcc make &amp;&amp; make install Mpich2 installed to: Installing MPE2 include files to /usr/local/include Installing MPE2 libraries to /usr/local/lib Installing MPE2 utility programs to /usr/local/bin Installing MPE2 configuration files to /usr/local/etc Installing MPE2 system utility programs to /usr/local/sbin Installing MPE2 man to /usr/local/share/man Installing MPE2 html to /usr/local/share/doc/ Installed MPE2 in /usr/local *hwloc (and lstopo) (download: http://www.open-mpi.org/software/hwloc/v1.2/) compile from sources. To get xml support install libxml2-dev and pkg-config apt-get install libxml2-dev pkg-config tar -xzvf hwloc-1.1.1.tar.gz cd hwloc-1.1.1 ./configure &amp;&amp; make &amp;&amp; make install Cleanup apt-get clean rm /etc/udev/rules.d/*-persistent-net.rules Make image ssh root@'''node''' tgz-g5k &gt; $HOME/grid5000/'''imagename'''.tgz make appropriate .env file. kaenv3 -p lenny-x64-nfs -u deploy &gt; lenny-x64-custom-2.3.env <br> == GotoBLAS2 == When compiling gotoblas on a node without direct internet access get this error: <source lang="">wget http://www.netlib.org/lapack/lapack-3.1.1.tgz --2011-05-19 03:11:03-- http://www.netlib.org/lapack/lapack-3.1.1.tgz Resolving www.netlib.org... 160.36.58.108 Connecting to www.netlib.org|160.36.58.108|:80... 
failed: Connection timed out. Retrying. --2011-05-19 03:14:13-- (try: 2) http://www.netlib.org/lapack/lapack-3.1.1.tgz Connecting to www.netlib.org|160.36.58.108|:80... failed: Connection timed out. Retrying. ...</source> Fix by downloading http://www.netlib.org/lapack/lapack-3.1.1.tgz to the GotoBLAS2 source directory and editing this line in the Makefile 184c184 < -wget http://www.netlib.org/lapack/lapack-3.1.1.tgz --- > # -wget http://www.netlib.org/lapack/lapack-3.1.1.tgz GotoBLAS needs to be compiled individualy for each unique machine - ie each cluster. add to .bashrc export CLUSTER=`hostname |sed 's/\([a-z]*\).*/\1/'` fba99e3707c7c7ae376a4635e7d5a6e08f29cdbf 598 597 2011-05-19T02:42:24Z Davepc 2 /* GotoBLAS2 */ wikitext text/x-wiki https://www.grid5000.fr/mediawiki/index.php/Grid5000:Home [https://www.grid5000.fr/mediawiki/index.php/Grid5000:UserCharter USAGE POLICY] == Login, job submission, deployment of image == *Select sites and clusters for experiments, using information on the [https://www.grid5000.fr/mediawiki/index.php/Grid5000:Network#Grid.275000_Sites Grid5000 network] and the [https://www.grid5000.fr/mediawiki/index.php/Status Status page] *Access is provided via access nodes '''access.SITE.grid5000.fr''' marked [https://www.grid5000.fr/mediawiki/index.php/External_access here] as ''accessible from '''everywhere''' via ssh with '''keyboard-interactive''' authentication method''. As soon as you are on one of the sites, you can directly ssh frontend node of any other site: <source lang="bash"> access_$ ssh frontend.SITE2 </source> *There is no access to Internet from computing nodes (external IPs should be registered on proxy), therefore, download/update your stuff at the access nodes. Several revision control clients are available. *Each site has a separate NFS, therefore, to run an application on several sites at once, you need to copy it '''scp, sftp, rsync''' between access or frontend nodes. 
*Jobs are run from the frondend nodes, using a [http://en.wikipedia.org/wiki/OpenPBS PBS]-like system [https://www.grid5000.fr/mediawiki/index.php/Cluster_experiment-OAR2 OAR]. Basic commands: **'''oarstat''' - queue status **'''oarsub''' - job submission **'''oardel''' - job removal Interactive job on deployed images: <source lang="bash"> fontend_$ oarsub -I -t deploy -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> Batch job on installed images: <source lang="bash"> fontend_$ oarsub BATCH_FILE -t allow_classic_ssh -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> *The image to deploy can be created and loaded with help of a [http://wiki.systemimager.org/index.php/Main_Page Systemimager]-like system [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2 Kadeploy]. Creating: [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2#Tune_an_environment_to_build_another_one:_customize_authentification_parameters described here] Loading: <source lang="bash"> fontend_$ kadeploy3 -a PATH_TO_PRIVATE_IMAGE_DESC -f $OAR_FILE_NODES </source> A Linux distribution lenny-x64-nfs-2.1 with mc, subversion, autotools, doxygen, MPICH2, GSL, Boost, R, gnuplot, graphviz, X11, evince is available at Orsay /home/nancy/alastovetsky/grid5000. == Compiling and running MPI applications == *Compilation should be done on one of the reserved nodes (e.g. ssh `head -n 1 $OAR_NODEFILE`) *Running MPI applications is described [https://www.grid5000.fr/mediawiki/index.php/Run_MPI_On_Grid%275000 here] **mpirun/mpiexec should be run from one of the reserved nodes (e.g. 
ssh `head -n 1 $OAR_NODEFILE`) == Setting up new deploy image == oarsub -I -t deploy -l nodes=1,walltime=12 kadeploy3 -e lenny-x64-nfs -f $OAR_FILE_NODES -k ssh root@`head -n 1 $OAR_NODEFILE` edit /etc/apt/sources.list apt-get update apt-get upgrade apt-get install libtool autoconf automake mc colorgcc ctags libboost-serialization-dev libboost-graph-dev libatlas-base-dev gfortran vim gdb valgrind screen subversion <br> Compiled for sources by us: *gsl-1.14 (download: ftp://ftp.gnu.org/gnu/gsl/) ./configure &amp;&amp; make &amp;&amp; make install *mpich2 (download: http://www.mcs.anl.gov/research/projects/mpich2/downloads/index.php?s=downloads) ./configure --enable-shared --enable-sharedlibs=gcc make &amp;&amp; make install Mpich2 installed to: Installing MPE2 include files to /usr/local/include Installing MPE2 libraries to /usr/local/lib Installing MPE2 utility programs to /usr/local/bin Installing MPE2 configuration files to /usr/local/etc Installing MPE2 system utility programs to /usr/local/sbin Installing MPE2 man to /usr/local/share/man Installing MPE2 html to /usr/local/share/doc/ Installed MPE2 in /usr/local *hwloc (and lstopo) (download: http://www.open-mpi.org/software/hwloc/v1.2/) compile from sources. To get xml support install libxml2-dev and pkg-config apt-get install libxml2-dev pkg-config tar -xzvf hwloc-1.1.1.tar.gz cd hwloc-1.1.1 ./configure &amp;&amp; make &amp;&amp; make install Cleanup apt-get clean rm /etc/udev/rules.d/*-persistent-net.rules Make image ssh root@'''node''' tgz-g5k &gt; $HOME/grid5000/'''imagename'''.tgz make appropriate .env file. kaenv3 -p lenny-x64-nfs -u deploy &gt; lenny-x64-custom-2.3.env <br> == GotoBLAS2 == When compiling gotoblas on a node without direct internet access get this error: <source lang="">wget http://www.netlib.org/lapack/lapack-3.1.1.tgz --2011-05-19 03:11:03-- http://www.netlib.org/lapack/lapack-3.1.1.tgz Resolving www.netlib.org... 160.36.58.108 Connecting to www.netlib.org|160.36.58.108|:80... 
failed: Connection timed out. Retrying. --2011-05-19 03:14:13-- (try: 2) http://www.netlib.org/lapack/lapack-3.1.1.tgz Connecting to www.netlib.org|160.36.58.108|:80... failed: Connection timed out. Retrying. ...</source> Fix by downloading http://www.netlib.org/lapack/lapack-3.1.1.tgz to the GotoBLAS2 source directory and editing this line in the Makefile 184c184 &lt; -wget http://www.netlib.org/lapack/lapack-3.1.1.tgz --- &gt; # -wget http://www.netlib.org/lapack/lapack-3.1.1.tgz <br> GotoBLAS needs to be compiled individualy for each unique machine - ie each cluster. Add the following to .bashrc export CLUSTER=`hostname |sed 's/\([a-z]*\).*/\1/'` LD_LIBRARY_PATH=$HOME/lib/$CLUSTER:$HOME/lib:/usr/local/lib:$LD_LIBRARY_PATH export LIBRARY_PATH=$HOME/lib/$CLUSTER:$HOME/lib:/usr/local/lib:$LIBRARY_PATH Run the following script once on each cluster: <source lang="bash">#! /bin/bash echo "Compiling gotoblas for cluster: $CLUSTER" cd $HOME/src if [ ! -d "$CLUSTER" ]; then mkdir $CLUSTER fi cd $CLUSTER tar -xzf ../Goto*.tar.gz cd Goto* make &> m.log if [ ! 
-d "$HOME/lib/$CLUSTER" ]; then mkdir $HOME/lib/$CLUSTER fi cp libgoto2.so $HOME/lib/$CLUSTER echo results ls -d $HOME/src/$CLUSTER ls $HOME/src/$CLUSTER ls -d $HOME/lib/$CLUSTER ls $HOME/lib/$CLUSTER</source> 2dbd18cd920fe31d5c89e10c599e0cdbefef5a58 599 598 2011-05-19T15:10:51Z Davepc 2 /* Setting up new deploy image */ wikitext text/x-wiki https://www.grid5000.fr/mediawiki/index.php/Grid5000:Home [https://www.grid5000.fr/mediawiki/index.php/Grid5000:UserCharter USAGE POLICY] == Login, job submission, deployment of image == *Select sites and clusters for experiments, using information on the [https://www.grid5000.fr/mediawiki/index.php/Grid5000:Network#Grid.275000_Sites Grid5000 network] and the [https://www.grid5000.fr/mediawiki/index.php/Status Status page] *Access is provided via access nodes '''access.SITE.grid5000.fr''' marked [https://www.grid5000.fr/mediawiki/index.php/External_access here] as ''accessible from '''everywhere''' via ssh with '''keyboard-interactive''' authentication method''. As soon as you are on one of the sites, you can directly ssh frontend node of any other site: <source lang="bash"> access_$ ssh frontend.SITE2 </source> *There is no access to Internet from computing nodes (external IPs should be registered on proxy), therefore, download/update your stuff at the access nodes. Several revision control clients are available. *Each site has a separate NFS, therefore, to run an application on several sites at once, you need to copy it '''scp, sftp, rsync''' between access or frontend nodes. *Jobs are run from the frondend nodes, using a [http://en.wikipedia.org/wiki/OpenPBS PBS]-like system [https://www.grid5000.fr/mediawiki/index.php/Cluster_experiment-OAR2 OAR]. 
Basic commands: **'''oarstat''' - queue status **'''oarsub''' - job submission **'''oardel''' - job removal Interactive job on deployed images: <source lang="bash"> fontend_$ oarsub -I -t deploy -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> Batch job on installed images: <source lang="bash"> fontend_$ oarsub BATCH_FILE -t allow_classic_ssh -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> *The image to deploy can be created and loaded with help of a [http://wiki.systemimager.org/index.php/Main_Page Systemimager]-like system [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2 Kadeploy]. Creating: [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2#Tune_an_environment_to_build_another_one:_customize_authentification_parameters described here] Loading: <source lang="bash"> fontend_$ kadeploy3 -a PATH_TO_PRIVATE_IMAGE_DESC -f $OAR_FILE_NODES </source> A Linux distribution lenny-x64-nfs-2.1 with mc, subversion, autotools, doxygen, MPICH2, GSL, Boost, R, gnuplot, graphviz, X11, evince is available at Orsay /home/nancy/alastovetsky/grid5000. == Compiling and running MPI applications == *Compilation should be done on one of the reserved nodes (e.g. ssh `head -n 1 $OAR_NODEFILE`) *Running MPI applications is described [https://www.grid5000.fr/mediawiki/index.php/Run_MPI_On_Grid%275000 here] **mpirun/mpiexec should be run from one of the reserved nodes (e.g. 
ssh `head -n 1 $OAR_NODEFILE`) == Setting up new deploy image == oarsub -I -t deploy -l nodes=1,walltime=12 kadeploy3 -e lenny-x64-nfs -f $OAR_FILE_NODES -k ssh root@`head -n 1 $OAR_NODEFILE` edit /etc/apt/sources.list apt-get update apt-get upgrade apt-get install libtool autoconf automake mc colorgcc ctags libboost-serialization-dev libboost-graph-dev libatlas-base-dev gfortran vim gdb valgrind screen subversion <br> Compiled for sources by us: *gsl-1.14 (download: ftp://ftp.gnu.org/gnu/gsl/) ./configure &amp;&amp; make &amp;&amp; make install *mpich2 (download: http://www.mcs.anl.gov/research/projects/mpich2/downloads/index.php?s=downloads) ./configure --enable-shared --enable-sharedlibs=gcc --with-pm=mpd make &amp;&amp; make install Mpich2 installed to: Installing MPE2 include files to /usr/local/include Installing MPE2 libraries to /usr/local/lib Installing MPE2 utility programs to /usr/local/bin Installing MPE2 configuration files to /usr/local/etc Installing MPE2 system utility programs to /usr/local/sbin Installing MPE2 man to /usr/local/share/man Installing MPE2 html to /usr/local/share/doc/ Installed MPE2 in /usr/local *hwloc (and lstopo) (download: http://www.open-mpi.org/software/hwloc/v1.2/) compile from sources. To get xml support install libxml2-dev and pkg-config apt-get install libxml2-dev pkg-config tar -xzvf hwloc-1.1.1.tar.gz cd hwloc-1.1.1 ./configure &amp;&amp; make &amp;&amp; make install Cleanup apt-get clean rm /etc/udev/rules.d/*-persistent-net.rules Make image ssh root@'''node''' tgz-g5k &gt; $HOME/grid5000/'''imagename'''.tgz make appropriate .env file. kaenv3 -p lenny-x64-nfs -u deploy &gt; lenny-x64-custom-2.3.env <br> == GotoBLAS2 == When compiling gotoblas on a node without direct internet access get this error: <source lang="">wget http://www.netlib.org/lapack/lapack-3.1.1.tgz --2011-05-19 03:11:03-- http://www.netlib.org/lapack/lapack-3.1.1.tgz Resolving www.netlib.org... 
160.36.58.108 Connecting to www.netlib.org|160.36.58.108|:80... failed: Connection timed out. Retrying. --2011-05-19 03:14:13-- (try: 2) http://www.netlib.org/lapack/lapack-3.1.1.tgz Connecting to www.netlib.org|160.36.58.108|:80... failed: Connection timed out. Retrying. ...</source> Fix by downloading http://www.netlib.org/lapack/lapack-3.1.1.tgz to the GotoBLAS2 source directory and editing this line in the Makefile 184c184 &lt; -wget http://www.netlib.org/lapack/lapack-3.1.1.tgz --- &gt; # -wget http://www.netlib.org/lapack/lapack-3.1.1.tgz <br> GotoBLAS needs to be compiled individualy for each unique machine - ie each cluster. Add the following to .bashrc export CLUSTER=`hostname |sed 's/\([a-z]*\).*/\1/'` LD_LIBRARY_PATH=$HOME/lib/$CLUSTER:$HOME/lib:/usr/local/lib:$LD_LIBRARY_PATH export LIBRARY_PATH=$HOME/lib/$CLUSTER:$HOME/lib:/usr/local/lib:$LIBRARY_PATH Run the following script once on each cluster: <source lang="bash">#! /bin/bash echo "Compiling gotoblas for cluster: $CLUSTER" cd $HOME/src if [ ! -d "$CLUSTER" ]; then mkdir $CLUSTER fi cd $CLUSTER tar -xzf ../Goto*.tar.gz cd Goto* make &> m.log if [ ! 
-d "$HOME/lib/$CLUSTER" ]; then mkdir $HOME/lib/$CLUSTER fi cp libgoto2.so $HOME/lib/$CLUSTER echo results ls -d $HOME/src/$CLUSTER ls $HOME/src/$CLUSTER ls -d $HOME/lib/$CLUSTER ls $HOME/lib/$CLUSTER</source> c1bf7b82b3626077d4cbe7cb3f0f1751f8777460 607 599 2011-06-02T13:59:42Z Davepc 2 wikitext text/x-wiki https://www.grid5000.fr/mediawiki/index.php/Grid5000:Home [https://www.grid5000.fr/mediawiki/index.php/Grid5000:UserCharter USAGE POLICY] == Login, job submission, deployment of image == *Select sites and clusters for experiments, using information on the [https://www.grid5000.fr/mediawiki/index.php/Grid5000:Network#Grid.275000_Sites Grid5000 network] and the [https://www.grid5000.fr/mediawiki/index.php/Status Status page] *Access is provided via access nodes '''access.SITE.grid5000.fr''' marked [https://www.grid5000.fr/mediawiki/index.php/External_access here] as ''accessible from '''everywhere''' via ssh with '''keyboard-interactive''' authentication method''. As soon as you are on one of the sites, you can directly ssh frontend node of any other site: <source lang="bash"> access_$ ssh frontend.SITE2 </source> *There is no access to Internet from computing nodes (external IPs should be registered on proxy), therefore, download/update your stuff at the access nodes. Several revision control clients are available. *Each site has a separate NFS, therefore, to run an application on several sites at once, you need to copy it '''scp, sftp, rsync''' between access or frontend nodes. *Jobs are run from the frondend nodes, using a [http://en.wikipedia.org/wiki/OpenPBS PBS]-like system [https://www.grid5000.fr/mediawiki/index.php/Cluster_experiment-OAR2 OAR]. 
Basic commands: **'''oarstat''' - queue status **'''oarsub''' - job submission **'''oardel''' - job removal Interactive job on deployed images: <source lang="bash"> frontend_$ oarsub -I -t deploy -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> Batch job on installed images: <source lang="bash"> frontend_$ oarsub BATCH_FILE -t allow_classic_ssh -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> *The image to deploy can be created and loaded with the help of a [http://wiki.systemimager.org/index.php/Main_Page Systemimager]-like system, [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2 Kadeploy]. Creating: [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2#Tune_an_environment_to_build_another_one:_customize_authentification_parameters described here] Loading: <source lang="bash"> frontend_$ kadeploy3 -a PATH_TO_PRIVATE_IMAGE_DESC -f $OAR_FILE_NODES </source> A Linux distribution lenny-x64-nfs-2.1 with mc, subversion, autotools, doxygen, MPICH2, GSL, Boost, R, gnuplot, graphviz, X11, evince is available at Orsay in /home/nancy/alastovetsky/grid5000. == Compiling and running MPI applications == *Compilation should be done on one of the reserved nodes (e.g. ssh `head -n 1 $OAR_NODEFILE`) *Running MPI applications is described [https://www.grid5000.fr/mediawiki/index.php/Run_MPI_On_Grid%275000 here] **mpirun/mpiexec should be run from one of the reserved nodes (e.g. 
ssh `head -n 1 $OAR_NODEFILE`) == Setting up new deploy image == oarsub -I -t deploy -l nodes=1,walltime=12 kadeploy3 -e lenny-x64-nfs -f $OAR_FILE_NODES -k ssh root@`head -n 1 $OAR_NODEFILE` edit /etc/apt/sources.list apt-get update apt-get upgrade apt-get install libtool autoconf automake mc colorgcc ctags libboost-serialization-dev libboost-graph-dev libatlas-base-dev gfortran vim gdb valgrind screen subversion <br> Compiled from source by us: *gsl-1.14 (download: ftp://ftp.gnu.org/gnu/gsl/) ./configure &amp;&amp; make &amp;&amp; make install *mpich2 (download: http://www.mcs.anl.gov/research/projects/mpich2/downloads/index.php?s=downloads) ./configure --enable-shared --enable-sharedlibs=gcc --with-pm=mpd make &amp;&amp; make install MPICH2 installed to: Installing MPE2 include files to /usr/local/include Installing MPE2 libraries to /usr/local/lib Installing MPE2 utility programs to /usr/local/bin Installing MPE2 configuration files to /usr/local/etc Installing MPE2 system utility programs to /usr/local/sbin Installing MPE2 man to /usr/local/share/man Installing MPE2 html to /usr/local/share/doc/ Installed MPE2 in /usr/local *hwloc (and lstopo) (download: http://www.open-mpi.org/software/hwloc/v1.2/) compile from sources. To get XML support, install libxml2-dev and pkg-config: apt-get install libxml2-dev pkg-config tar -xzvf hwloc-1.1.1.tar.gz cd hwloc-1.1.1 ./configure &amp;&amp; make &amp;&amp; make install Cleanup: apt-get clean rm /etc/udev/rules.d/*-persistent-net.rules Make the image: ssh root@'''node''' tgz-g5k &gt; $HOME/grid5000/'''imagename'''.tgz Make an appropriate .env file: kaenv3 -p lenny-x64-nfs -u deploy &gt; lenny-x64-custom-2.3.env <br> == GotoBLAS2 == When compiling GotoBLAS2 on a node without direct Internet access, you get this error: <source lang="">wget http://www.netlib.org/lapack/lapack-3.1.1.tgz --2011-05-19 03:11:03-- http://www.netlib.org/lapack/lapack-3.1.1.tgz Resolving www.netlib.org... 
160.36.58.108 Connecting to www.netlib.org|160.36.58.108|:80... failed: Connection timed out. Retrying. --2011-05-19 03:14:13-- (try: 2) http://www.netlib.org/lapack/lapack-3.1.1.tgz Connecting to www.netlib.org|160.36.58.108|:80... failed: Connection timed out. Retrying. ...</source> Fix this by downloading http://www.netlib.org/lapack/lapack-3.1.1.tgz to the GotoBLAS2 source directory and commenting out this line in the Makefile: 184c184 &lt; -wget http://www.netlib.org/lapack/lapack-3.1.1.tgz --- &gt; # -wget http://www.netlib.org/lapack/lapack-3.1.1.tgz <br> GotoBLAS2 needs to be compiled individually for each unique machine, i.e. each cluster. Add the following to .bashrc: export CLUSTER=`hostname | sed 's/\([a-z]*\).*/\1/'` export LD_LIBRARY_PATH=$HOME/lib/$CLUSTER:$HOME/lib:/usr/local/lib:$LD_LIBRARY_PATH export LIBRARY_PATH=$HOME/lib/$CLUSTER:$HOME/lib:/usr/local/lib:$LIBRARY_PATH Run the following script once on each cluster: <source lang="bash">#!/bin/bash echo "Compiling gotoblas for cluster: $CLUSTER" cd $HOME/src if [ ! -d "$CLUSTER" ]; then mkdir $CLUSTER fi cd $CLUSTER tar -xzf ../Goto*.tar.gz cd Goto* make &> m.log if [ ! -d "$HOME/lib/$CLUSTER" ]; then mkdir $HOME/lib/$CLUSTER fi cp libgoto2.so $HOME/lib/$CLUSTER echo results ls -d $HOME/src/$CLUSTER ls $HOME/src/$CLUSTER ls -d $HOME/lib/$CLUSTER ls $HOME/lib/$CLUSTER</source> == Paging and the OOM-Killer == When running experiments that exhaust available memory, problems can occur with over-commit. See [[HCL_cluster#Paging_and_the_OOM-Killer]] for more detail. 
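The per-cluster build relies on node hostnames beginning with the cluster name, which the sed expression in .bashrc extracts. A minimal off-line sanity check of that expression (the hostnames below are only examples, not a statement about current Grid5000 clusters):

```shell
#!/bin/sh
# Check the sed expression from .bashrc: it keeps only the leading run of
# lowercase letters in a hostname, i.e. the cluster name.
extract_cluster() {
    echo "$1" | sed 's/\([a-z]*\).*/\1/'
}
extract_cluster "grelon-42.nancy.grid5000.fr"    # -> grelon
extract_cluster "paradent-7.rennes.grid5000.fr"  # -> paradent
```

Note that a hostname starting with a digit or an uppercase letter would make `[a-z]*` match the empty string, leaving `$CLUSTER` empty.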
https://www.grid5000.fr/mediawiki/index.php/Grid5000:Home

[https://www.grid5000.fr/mediawiki/index.php/Grid5000:UserCharter USAGE POLICY]

== Login, job submission, deployment of image ==

*Select sites and clusters for experiments, using information on the [https://www.grid5000.fr/mediawiki/index.php/Grid5000:Network#Grid.275000_Sites Grid5000 network] and the
[https://www.grid5000.fr/mediawiki/index.php/Status Status page]
*Access is provided via the access nodes '''access.SITE.grid5000.fr''', marked [https://www.grid5000.fr/mediawiki/index.php/External_access here] as ''accessible from '''everywhere''' via ssh with the '''keyboard-interactive''' authentication method''. As soon as you are on one of the sites, you can ssh directly to the frontend node of any other site:
<source lang="bash">
access_$ ssh frontend.SITE2
</source>
*There is no Internet access from the computing nodes (external IPs must be registered on the proxy), so download/update your material on the access nodes. Several revision control clients are available.
*Each site has a separate NFS, so to run an application on several sites at once you need to copy it ('''scp, sftp, rsync''') between access or frontend nodes.
*Jobs are run from the frontend nodes, using [https://www.grid5000.fr/mediawiki/index.php/Cluster_experiment-OAR2 OAR], a [http://en.wikipedia.org/wiki/OpenPBS PBS]-like system. Basic commands:
**'''oarstat''' - queue status
**'''oarsub''' - job submission
**'''oardel''' - job removal
Interactive job on deployed images:
<source lang="bash">
frontend_$ oarsub -I -t deploy -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"']
</source>
Batch job on installed images:
<source lang="bash">
frontend_$ oarsub BATCH_FILE -t allow_classic_ssh -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"']
</source>
*The image to deploy can be created and loaded with the help of [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2 Kadeploy], a [http://wiki.systemimager.org/index.php/Main_Page Systemimager]-like system.
Creating: [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2#Tune_an_environment_to_build_another_one:_customize_authentification_parameters described here]
Loading:
<source lang="bash">
frontend_$ kadeploy3 -a PATH_TO_PRIVATE_IMAGE_DESC -f $OAR_FILE_NODES
</source>
*A Linux distribution lenny-x64-nfs-2.1 with mc, subversion, autotools, doxygen, MPICH2, GSL, Boost, R, gnuplot, graphviz, X11 and evince is available at Orsay under /home/nancy/alastovetsky/grid5000.

== Compiling and running MPI applications ==

*Compilation should be done on one of the reserved nodes (e.g. ssh `head -n 1 $OAR_NODEFILE`)
*Running MPI applications is described [https://www.grid5000.fr/mediawiki/index.php/Run_MPI_On_Grid%275000 here]
**mpirun/mpiexec should be run from one of the reserved nodes (e.g. ssh `head -n 1 $OAR_NODEFILE`)

== Setting up new deploy image ==

<source lang="bash">
oarsub -I -t deploy -l nodes=1,walltime=12
kadeploy3 -e lenny-x64-nfs -f $OAR_FILE_NODES -k
ssh root@`head -n 1 $OAR_NODEFILE`
</source>
Edit /etc/apt/sources.list, then:
<source lang="bash">
apt-get update
apt-get upgrade
apt-get install libtool autoconf automake mc colorgcc ctags libboost-serialization-dev libboost-graph-dev libatlas-base-dev gfortran vim gdb valgrind screen subversion
</source>
Compiled from source by us:
*gsl-1.14 (download: ftp://ftp.gnu.org/gnu/gsl/)
<source lang="bash">
./configure && make && make install
</source>
*mpich2 (download: http://www.mcs.anl.gov/research/projects/mpich2/downloads/index.php?s=downloads)
<source lang="bash">
./configure --enable-shared --enable-sharedlibs=gcc --with-pm=mpd
make && make install
</source>
MPICH2 installed to:
<source lang="">
Installing MPE2 include files to /usr/local/include
Installing MPE2 libraries to /usr/local/lib
Installing MPE2 utility programs to /usr/local/bin
Installing MPE2 configuration files to /usr/local/etc
Installing MPE2 system utility programs to /usr/local/sbin
Installing MPE2 man to /usr/local/share/man
Installing MPE2 html to /usr/local/share/doc/
Installed MPE2 in /usr/local
</source>
*hwloc (and lstopo) (download: http://www.open-mpi.org/software/hwloc/v1.2/), compiled from sources. For XML support, first install libxml2-dev and pkg-config:
<source lang="bash">
apt-get install libxml2-dev pkg-config
tar -xzvf hwloc-1.1.1.tar.gz
cd hwloc-1.1.1
./configure && make && make install
</source>
Cleanup:
<source lang="bash">
apt-get clean
rm /etc/udev/rules.d/*-persistent-net.rules
</source>
Make the image:
<source lang="bash">
ssh root@NODE tgz-g5k > $HOME/grid5000/IMAGENAME.tgz
</source>
Make an appropriate .env file:
<source lang="bash">
kaenv3 -p lenny-x64-nfs -u deploy > lenny-x64-custom-2.3.env
</source>

== GotoBLAS2 ==

When compiling GotoBLAS2 on a node without direct Internet access, you get this error:
<source lang="">
wget http://www.netlib.org/lapack/lapack-3.1.1.tgz
--2011-05-19 03:11:03--  http://www.netlib.org/lapack/lapack-3.1.1.tgz
Resolving www.netlib.org... 160.36.58.108
Connecting to www.netlib.org|160.36.58.108|:80... failed: Connection timed out. Retrying.
--2011-05-19 03:14:13--  (try: 2)  http://www.netlib.org/lapack/lapack-3.1.1.tgz
Connecting to www.netlib.org|160.36.58.108|:80... failed: Connection timed out. Retrying.
...
</source>
Fix this by downloading http://www.netlib.org/lapack/lapack-3.1.1.tgz to the GotoBLAS2 source directory and commenting out this line in the Makefile:
<source lang="">
184c184
< -wget http://www.netlib.org/lapack/lapack-3.1.1.tgz
---
> # -wget http://www.netlib.org/lapack/lapack-3.1.1.tgz
</source>
GotoBLAS2 needs to be compiled individually for each unique machine, i.e. each cluster. Add the following to .bashrc:
<source lang="bash">
export CLUSTER=`hostname | sed 's/\([a-z]*\).*/\1/'`
export LD_LIBRARY_PATH=$HOME/lib/$CLUSTER:$HOME/lib:/usr/local/lib:$LD_LIBRARY_PATH
export LIBRARY_PATH=$HOME/lib/$CLUSTER:$HOME/lib:/usr/local/lib:$LIBRARY_PATH
</source>
Run the following script once on each cluster:
<source lang="bash">
#!/bin/bash
echo "Compiling GotoBLAS2 for cluster: $CLUSTER"
cd $HOME/src
if [ ! -d "$CLUSTER" ]; then
    mkdir $CLUSTER
fi
cd $CLUSTER
tar -xzf ../Goto*.tar.gz
cd Goto*
make &> m.log
if [ ! -d "$HOME/lib/$CLUSTER" ]; then
    mkdir $HOME/lib/$CLUSTER
fi
cp libgoto2.so $HOME/lib/$CLUSTER
echo results
ls -d $HOME/src/$CLUSTER
ls $HOME/src/$CLUSTER
ls -d $HOME/lib/$CLUSTER
ls $HOME/lib/$CLUSTER
</source>

== Paging and the OOM-Killer ==

When running experiments that exhaust the available memory, problems can occur with over-commit. See [[HCL_cluster#Paging_and_the_OOM-Killer]] for more detail.

== Example of experiment setup across several sites ==

Note: all these steps should probably be scripted into one command in the future. Sources of all files mentioned below are available at [[Grid5000:sources]].

Pick one head node as the main head node (I use grenoble, but any will do).

Set up the sources:
<source lang="bash">
cd dave/fupermod-1.1.0
make clean
./configure --with-cblas=goto --prefix=/usr/local/
</source>
Reserve 2 nodes from all clusters on a 3-cluster site:
<source lang="bash">
oarsub -r "2011-07-25 11:01:01" -t deploy -l cluster=3/nodes=2,walltime=11:59:00
</source>
Automate with:
<source lang="bash">
for a in 2 3 4; do for i in `cat sites.$a`; do echo $a $i; ssh $i oarsub -r "2011-07-25 11:01:01" -t deploy -l cluster=$a/nodes=2,walltime=11:59:00; done; done
</source>
Then on each site:
<source lang="bash">
kadeploy3 -a $HOME/grid5000/lenny-dave.env -f $OAR_NODE_FILE --output-ok-nodes deployed.rennes
</source>
or:
<source lang="bash">
for i in `cat sites`; do echo $i; ssh $i kadeploy3 -a $HOME/grid5000/lenny-dave.env -f $OAR_NODE_FILE --output-ok-nodes deployed.$i; done
</source>
Gather the deployed files to a head node:
<source lang="bash">
for i in `cat ~/sites`; do echo $i; scp $i:deployed* . ; done
cat deployed.* > deployed.all
</source>
Copy the cluster-specific libs to each deployed node's /usr/local/lib dir with the script:
<source lang="bash">
copy_local_libs.sh deployed.all
</source>
Copy the source files to the root dir of each deployed node.
Then make install on each (note: ssh -f does this in parallel):
<source lang="bash">
for i in `cat ~/deployed.all`; do echo $i; rsync -aP ~/dave/fupermod-1.1.0 root@$i: ; done
for i in `cat ~/deployed.all`; do echo $i; ssh -f root@$i "cd fupermod-1.1.0 ; make all install" ; done
</source>
ssh to the first node and run:
<source lang="bash">
ssh `head -n1 deployed.all`
n=$(cat deployed.all | wc -l)
mpdboot --totalnum=$n --file=$HOME/deployed.all
mpdtrace
cd dave/data/
mpirun -n $n /usr/local/bin/partitioner -l /usr/local/lib/libmxm_col.so -a0 -D10000 -o N=100
</source>
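The '''$CLUSTER''' derivation used in the GotoBLAS2 section relies on the cluster name being the leading run of lowercase letters in the node hostname. A minimal standalone sketch of that extraction (the example hostnames are illustrative):
<source lang="bash">
#!/bin/sh
# Keep only the leading run of lowercase letters of a hostname,
# exactly as the .bashrc snippet in the GotoBLAS2 section does.
cluster_of() {
    echo "$1" | sed 's/\([a-z]*\).*/\1/'
}

cluster_of "paradent-12.rennes.grid5000.fr"   # -> paradent
cluster_of "genepi-3.grenoble.grid5000.fr"    # -> genepi
</source>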
https://www.grid5000.fr/mediawiki/index.php/Grid5000:Home [https://www.grid5000.fr/mediawiki/index.php/Grid5000:UserCharter USAGE POLICY]

== Login, job submission, deployment of image ==
*Select sites and clusters for experiments using the information on the [https://www.grid5000.fr/mediawiki/index.php/Grid5000:Network#Grid.275000_Sites Grid5000 network] and the [https://www.grid5000.fr/mediawiki/index.php/Status Status page].
*Access is provided via the access nodes '''access.SITE.grid5000.fr''' marked [https://www.grid5000.fr/mediawiki/index.php/External_access here] as ''accessible from '''everywhere''' via ssh with the '''keyboard-interactive''' authentication method''. Once you are on one of the sites, you can ssh directly to the frontend node of any other site: <source lang="bash"> access_$ ssh frontend.SITE2 </source>
*There is no Internet access from the computing nodes (external IPs have to be registered on the proxy), so download/update your stuff on the access nodes. Several revision control clients are available.
*Each site has a separate NFS, so to run an application on several sites at once you need to copy it ('''scp''', '''sftp''', '''rsync''') between the access or frontend nodes.
https://www.grid5000.fr/mediawiki/index.php/Grid5000:Home

[https://www.grid5000.fr/mediawiki/index.php/Grid5000:UserCharter USAGE POLICY]

== Login, job submission, deployment of image ==
*Select sites and clusters for experiments using the information on the [https://www.grid5000.fr/mediawiki/index.php/Grid5000:Network#Grid.275000_Sites Grid5000 network] page and the [https://www.grid5000.fr/mediawiki/index.php/Status Status page].
*Access is provided via the access nodes '''access.SITE.grid5000.fr''' marked [https://www.grid5000.fr/mediawiki/index.php/External_access here] as ''accessible from '''everywhere''' via ssh with the '''keyboard-interactive''' authentication method''. Once you are on one of the sites, you can ssh directly to the frontend node of any other site: <source lang="bash"> access_$ ssh frontend.SITE2 </source>
*There is no Internet access from the computing nodes (external IPs have to be registered on the proxy); therefore, download/update your stuff on the access nodes. Several revision control clients are available.
*Each site has a separate NFS; therefore, to run an application on several sites at once, you need to copy it ('''scp, sftp, rsync''') between the access or frontend nodes.
*Jobs are run from the frondend nodes, using a [http://en.wikipedia.org/wiki/OpenPBS PBS]-like system [https://www.grid5000.fr/mediawiki/index.php/Cluster_experiment-OAR2 OAR]. Basic commands: **'''oarstat''' - queue status **'''oarsub''' - job submission **'''oardel''' - job removal Interactive job on deployed images: <source lang="bash"> fontend_$ oarsub -I -t deploy -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> Batch job on installed images: <source lang="bash"> fontend_$ oarsub BATCH_FILE -t allow_classic_ssh -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> *The image to deploy can be created and loaded with help of a [http://wiki.systemimager.org/index.php/Main_Page Systemimager]-like system [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2 Kadeploy]. Creating: [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2#Tune_an_environment_to_build_another_one:_customize_authentification_parameters described here] Loading: <source lang="bash"> fontend_$ kadeploy3 -a PATH_TO_PRIVATE_IMAGE_DESC -f $OAR_FILE_NODES </source> A Linux distribution lenny-x64-nfs-2.1 with mc, subversion, autotools, doxygen, MPICH2, GSL, Boost, R, gnuplot, graphviz, X11, evince is available at Orsay /home/nancy/alastovetsky/grid5000. == Compiling and running MPI applications == *Compilation should be done on one of the reserved nodes (e.g. ssh `head -n 1 $OAR_NODEFILE`) *Running MPI applications is described [https://www.grid5000.fr/mediawiki/index.php/Run_MPI_On_Grid%275000 here] **mpirun/mpiexec should be run from one of the reserved nodes (e.g. 
ssh `head -n 1 $OAR_NODEFILE`) == Setting up new deploy image == oarsub -I -t deploy -l nodes=1,walltime=12 kadeploy3 -e lenny-x64-nfs -f $OAR_FILE_NODES -k ssh root@`head -n 1 $OAR_NODEFILE` default password: grid5000 edit /etc/apt/sources.list apt-get update apt-get upgrade apt-get install libtool autoconf automake mc colorgcc ctags libboost-serialization-dev libboost-graph-dev libatlas-base-dev gfortran vim gdb valgrind screen subversion iperf bc <br> Compiled for sources by us: *gsl-1.14 (download: ftp://ftp.gnu.org/gnu/gsl/) ./configure &amp;&amp; make &amp;&amp; make install *mpich2 (download: http://www.mcs.anl.gov/research/projects/mpich2/downloads/index.php?s=downloads) ./configure --enable-shared --enable-sharedlibs=gcc --with-pm=mpd make &amp;&amp; make install Mpich2 installed to: Installing MPE2 include files to /usr/local/include Installing MPE2 libraries to /usr/local/lib Installing MPE2 utility programs to /usr/local/bin Installing MPE2 configuration files to /usr/local/etc Installing MPE2 system utility programs to /usr/local/sbin Installing MPE2 man to /usr/local/share/man Installing MPE2 html to /usr/local/share/doc/ Installed MPE2 in /usr/local *hwloc (and lstopo) (download: http://www.open-mpi.org/software/hwloc/v1.2/) compile from sources. To get xml support install libxml2-dev and pkg-config apt-get install libxml2-dev pkg-config tar -xzvf hwloc-1.1.1.tar.gz cd hwloc-1.1.1 ./configure &amp;&amp; make &amp;&amp; make install Change root password. rm sources from root dir. Edit the "message of the day" vi /etc/motd.tail echo 90 > /proc/sys/vm/overcommit_ratio echo 2 > /proc/sys/vm/overcommit_memory Cleanup apt-get clean rm /etc/udev/rules.d/*-persistent-net.rules Make image ssh root@'''node''' tgz-g5k &gt; $HOME/grid5000/'''imagename'''.tgz make appropriate .env file. 
kaenv3 -p lenny-x64-nfs -u deploy &gt; lenny-x64-custom-2.3.env <br> === Wish list === To be installed when next making an image iperf bc == GotoBLAS2 == When compiling gotoblas on a node without direct internet access get this error: <source lang="">wget http://www.netlib.org/lapack/lapack-3.1.1.tgz --2011-05-19 03:11:03-- http://www.netlib.org/lapack/lapack-3.1.1.tgz Resolving www.netlib.org... 160.36.58.108 Connecting to www.netlib.org|160.36.58.108|:80... failed: Connection timed out. Retrying. --2011-05-19 03:14:13-- (try: 2) http://www.netlib.org/lapack/lapack-3.1.1.tgz Connecting to www.netlib.org|160.36.58.108|:80... failed: Connection timed out. Retrying. ...</source> Fix by downloading http://www.netlib.org/lapack/lapack-3.1.1.tgz to the GotoBLAS2 source directory and editing this line in the Makefile 184c184 &lt; -wget http://www.netlib.org/lapack/lapack-3.1.1.tgz --- &gt; # -wget http://www.netlib.org/lapack/lapack-3.1.1.tgz <br> GotoBLAS needs to be compiled individualy for each unique machine - ie each cluster. Add the following to .bashrc export CLUSTER=`hostname |sed 's/\([a-z]*\).*/\1/'` LD_LIBRARY_PATH=$HOME/lib/$CLUSTER:$HOME/lib:/usr/local/lib:$LD_LIBRARY_PATH export LIBRARY_PATH=$HOME/lib/$CLUSTER:$HOME/lib:/usr/local/lib:$LIBRARY_PATH Run the following script once on each cluster: <source lang="bash">#! /bin/bash echo "Compiling gotoblas for cluster: $CLUSTER" cd $HOME/src if [ ! -d "$CLUSTER" ]; then mkdir $CLUSTER fi cd $CLUSTER tar -xzf ../Goto*.tar.gz cd Goto* make &> m.log if [ ! -d "$HOME/lib/$CLUSTER" ]; then mkdir $HOME/lib/$CLUSTER fi cp libgoto2.so $HOME/lib/$CLUSTER echo results ls -d $HOME/src/$CLUSTER ls $HOME/src/$CLUSTER ls -d $HOME/lib/$CLUSTER ls $HOME/lib/$CLUSTER</source> == Paging and the OOM-Killer == When doing exhaustion of available memory experiments, problems can occur with over-commit. See [[HCL_cluster#Paging_and_the_OOM-Killer]] for more detail. 
== Example of experiment setup across several sites == Sources of all files mentioned below is available at: [[Grid5000:sources]]. Pick one head node as the main head node (I use grenoble, but any will do). Setup sources cd dave/fupermod-1.1.0 make clean ./configure --with-cblas=goto --prefix=/usr/local/ Reserve 2 nodes from all clusters on a 3 cluster site: oarsub -r "2011-07-25 11:01:01" -t deploy -l cluster=3/nodes=2,walltime=11:59:00 Automate with: for a in 2 3 4; do for i in `cat sites.$a`; do echo $a $i; ssh $i oarsub -r "2011-07-25 11:01:01" -t deploy -l cluster=$a/nodes=2,walltime=11:59:00; done; done Then on each site, where xxx is site name: kadeploy3 -a $HOME/grid5000/lenny-dave.env -f $OAR_NODE_FILE --output-ok-nodes deployed.xxx Gather deployed files to a head node: for i in `cat ~/sites `; do echo $i; scp $i:deployed* .&nbsp;; done cat deployed.* &gt; deployed.all Copy cluster specific libs to each deployed node /usr/local/lib dir with script copy_local_libs.sh deployed.all Copy source files to root dir of each deployed node. 
Then make install each (node ssh -f does this in parallel) for i in `cat ~/deployed.all`; do echo $i; rsync -aP ~/dave/fupermod-1.1.0 root@$i:&nbsp;; done for i in `cat ~/deployed.all`; do echo $i; ssh -f root@$i "cd fupermod-1.1.0&nbsp;; make all install"&nbsp;; done ssh to the first node ssh `head -n1 deployed.all` n=$(cat deployed.all |wc -l) mpdboot --totalnum=$n --file=$HOME/deployed.all mpdtrace cd dave/data/ mpirun -n $n /usr/local/bin/partitioner -l /usr/local/lib/libmxm_col.so -a0 -D10000 -o N=100 Cleanup after: for i in `cat ~/sites `; do echo $i; ssh $i rm deployed.* ; done == Check network speed == apt-get install iperf 0eb0aad93c21904c5b69fa787967f3d0dfbffe3e 633 632 2011-07-25T20:10:20Z Davepc 2 /* Wish list */ wikitext text/x-wiki https://www.grid5000.fr/mediawiki/index.php/Grid5000:Home [https://www.grid5000.fr/mediawiki/index.php/Grid5000:UserCharter USAGE POLICY] == Login, job submission, deployment of image == *Select sites and clusters for experiments, using information on the [https://www.grid5000.fr/mediawiki/index.php/Grid5000:Network#Grid.275000_Sites Grid5000 network] and the [https://www.grid5000.fr/mediawiki/index.php/Status Status page] *Access is provided via access nodes '''access.SITE.grid5000.fr''' marked [https://www.grid5000.fr/mediawiki/index.php/External_access here] as ''accessible from '''everywhere''' via ssh with '''keyboard-interactive''' authentication method''. As soon as you are on one of the sites, you can directly ssh frontend node of any other site: <source lang="bash"> access_$ ssh frontend.SITE2 </source> *There is no access to Internet from computing nodes (external IPs should be registered on proxy), therefore, download/update your stuff at the access nodes. Several revision control clients are available. *Each site has a separate NFS, therefore, to run an application on several sites at once, you need to copy it '''scp, sftp, rsync''' between access or frontend nodes. 
*Jobs are run from the frondend nodes, using a [http://en.wikipedia.org/wiki/OpenPBS PBS]-like system [https://www.grid5000.fr/mediawiki/index.php/Cluster_experiment-OAR2 OAR]. Basic commands: **'''oarstat''' - queue status **'''oarsub''' - job submission **'''oardel''' - job removal Interactive job on deployed images: <source lang="bash"> fontend_$ oarsub -I -t deploy -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> Batch job on installed images: <source lang="bash"> fontend_$ oarsub BATCH_FILE -t allow_classic_ssh -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> *The image to deploy can be created and loaded with help of a [http://wiki.systemimager.org/index.php/Main_Page Systemimager]-like system [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2 Kadeploy]. Creating: [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2#Tune_an_environment_to_build_another_one:_customize_authentification_parameters described here] Loading: <source lang="bash"> fontend_$ kadeploy3 -a PATH_TO_PRIVATE_IMAGE_DESC -f $OAR_FILE_NODES </source> A Linux distribution lenny-x64-nfs-2.1 with mc, subversion, autotools, doxygen, MPICH2, GSL, Boost, R, gnuplot, graphviz, X11, evince is available at Orsay /home/nancy/alastovetsky/grid5000. == Compiling and running MPI applications == *Compilation should be done on one of the reserved nodes (e.g. ssh `head -n 1 $OAR_NODEFILE`) *Running MPI applications is described [https://www.grid5000.fr/mediawiki/index.php/Run_MPI_On_Grid%275000 here] **mpirun/mpiexec should be run from one of the reserved nodes (e.g. 
ssh `head -n 1 $OAR_NODEFILE`) == Setting up new deploy image == oarsub -I -t deploy -l nodes=1,walltime=12 kadeploy3 -e lenny-x64-nfs -f $OAR_FILE_NODES -k ssh root@`head -n 1 $OAR_NODEFILE` default password: grid5000 edit /etc/apt/sources.list apt-get update apt-get upgrade apt-get install libtool autoconf automake mc colorgcc ctags libboost-serialization-dev libboost-graph-dev libatlas-base-dev gfortran vim gdb valgrind screen subversion iperf bc <br> Compiled for sources by us: *gsl-1.14 (download: ftp://ftp.gnu.org/gnu/gsl/) ./configure &amp;&amp; make &amp;&amp; make install *mpich2 (download: http://www.mcs.anl.gov/research/projects/mpich2/downloads/index.php?s=downloads) ./configure --enable-shared --enable-sharedlibs=gcc --with-pm=mpd make &amp;&amp; make install Mpich2 installed to: Installing MPE2 include files to /usr/local/include Installing MPE2 libraries to /usr/local/lib Installing MPE2 utility programs to /usr/local/bin Installing MPE2 configuration files to /usr/local/etc Installing MPE2 system utility programs to /usr/local/sbin Installing MPE2 man to /usr/local/share/man Installing MPE2 html to /usr/local/share/doc/ Installed MPE2 in /usr/local *hwloc (and lstopo) (download: http://www.open-mpi.org/software/hwloc/v1.2/) compile from sources. To get xml support install libxml2-dev and pkg-config apt-get install libxml2-dev pkg-config tar -xzvf hwloc-1.1.1.tar.gz cd hwloc-1.1.1 ./configure &amp;&amp; make &amp;&amp; make install Change root password. rm sources from root dir. Edit the "message of the day" vi /etc/motd.tail echo 90 > /proc/sys/vm/overcommit_ratio echo 2 > /proc/sys/vm/overcommit_memory Cleanup apt-get clean rm /etc/udev/rules.d/*-persistent-net.rules Make image ssh root@'''node''' tgz-g5k &gt; $HOME/grid5000/'''imagename'''.tgz make appropriate .env file. 
kaenv3 -p lenny-x64-nfs -u deploy &gt; lenny-x64-custom-2.3.env <br> == GotoBLAS2 == When compiling gotoblas on a node without direct internet access get this error: <source lang="">wget http://www.netlib.org/lapack/lapack-3.1.1.tgz --2011-05-19 03:11:03-- http://www.netlib.org/lapack/lapack-3.1.1.tgz Resolving www.netlib.org... 160.36.58.108 Connecting to www.netlib.org|160.36.58.108|:80... failed: Connection timed out. Retrying. --2011-05-19 03:14:13-- (try: 2) http://www.netlib.org/lapack/lapack-3.1.1.tgz Connecting to www.netlib.org|160.36.58.108|:80... failed: Connection timed out. Retrying. ...</source> Fix by downloading http://www.netlib.org/lapack/lapack-3.1.1.tgz to the GotoBLAS2 source directory and editing this line in the Makefile 184c184 &lt; -wget http://www.netlib.org/lapack/lapack-3.1.1.tgz --- &gt; # -wget http://www.netlib.org/lapack/lapack-3.1.1.tgz <br> GotoBLAS needs to be compiled individualy for each unique machine - ie each cluster. Add the following to .bashrc export CLUSTER=`hostname |sed 's/\([a-z]*\).*/\1/'` LD_LIBRARY_PATH=$HOME/lib/$CLUSTER:$HOME/lib:/usr/local/lib:$LD_LIBRARY_PATH export LIBRARY_PATH=$HOME/lib/$CLUSTER:$HOME/lib:/usr/local/lib:$LIBRARY_PATH Run the following script once on each cluster: <source lang="bash">#! /bin/bash echo "Compiling gotoblas for cluster: $CLUSTER" cd $HOME/src if [ ! -d "$CLUSTER" ]; then mkdir $CLUSTER fi cd $CLUSTER tar -xzf ../Goto*.tar.gz cd Goto* make &> m.log if [ ! -d "$HOME/lib/$CLUSTER" ]; then mkdir $HOME/lib/$CLUSTER fi cp libgoto2.so $HOME/lib/$CLUSTER echo results ls -d $HOME/src/$CLUSTER ls $HOME/src/$CLUSTER ls -d $HOME/lib/$CLUSTER ls $HOME/lib/$CLUSTER</source> == Paging and the OOM-Killer == When doing exhaustion of available memory experiments, problems can occur with over-commit. See [[HCL_cluster#Paging_and_the_OOM-Killer]] for more detail. == Example of experiment setup across several sites == Sources of all files mentioned below is available at: [[Grid5000:sources]]. 
Pick one head node as the main head node (I use grenoble, but any will do). Setup sources cd dave/fupermod-1.1.0 make clean ./configure --with-cblas=goto --prefix=/usr/local/ Reserve 2 nodes from all clusters on a 3 cluster site: oarsub -r "2011-07-25 11:01:01" -t deploy -l cluster=3/nodes=2,walltime=11:59:00 Automate with: for a in 2 3 4; do for i in `cat sites.$a`; do echo $a $i; ssh $i oarsub -r "2011-07-25 11:01:01" -t deploy -l cluster=$a/nodes=2,walltime=11:59:00; done; done Then on each site, where xxx is site name: kadeploy3 -a $HOME/grid5000/lenny-dave.env -f $OAR_NODE_FILE --output-ok-nodes deployed.xxx Gather deployed files to a head node: for i in `cat ~/sites `; do echo $i; scp $i:deployed* .&nbsp;; done cat deployed.* &gt; deployed.all Copy cluster specific libs to each deployed node /usr/local/lib dir with script copy_local_libs.sh deployed.all Copy source files to root dir of each deployed node. Then make install each (node ssh -f does this in parallel) for i in `cat ~/deployed.all`; do echo $i; rsync -aP ~/dave/fupermod-1.1.0 root@$i:&nbsp;; done for i in `cat ~/deployed.all`; do echo $i; ssh -f root@$i "cd fupermod-1.1.0&nbsp;; make all install"&nbsp;; done ssh to the first node ssh `head -n1 deployed.all` n=$(cat deployed.all |wc -l) mpdboot --totalnum=$n --file=$HOME/deployed.all mpdtrace cd dave/data/ mpirun -n $n /usr/local/bin/partitioner -l /usr/local/lib/libmxm_col.so -a0 -D10000 -o N=100 Cleanup after: for i in `cat ~/sites `; do echo $i; ssh $i rm deployed.* ; done == Check network speed == apt-get install iperf 895066326424671b20e634a5eccd41fe472a135b 634 633 2011-07-25T20:11:19Z Davepc 2 /* Setting up new deploy image */ wikitext text/x-wiki https://www.grid5000.fr/mediawiki/index.php/Grid5000:Home [https://www.grid5000.fr/mediawiki/index.php/Grid5000:UserCharter USAGE POLICY] == Login, job submission, deployment of image == *Select sites and clusters for experiments, using information on the 
[https://www.grid5000.fr/mediawiki/index.php/Grid5000:Network#Grid.275000_Sites Grid5000 network] and the [https://www.grid5000.fr/mediawiki/index.php/Status Status page] *Access is provided via access nodes '''access.SITE.grid5000.fr''' marked [https://www.grid5000.fr/mediawiki/index.php/External_access here] as ''accessible from '''everywhere''' via ssh with '''keyboard-interactive''' authentication method''. As soon as you are on one of the sites, you can directly ssh frontend node of any other site: <source lang="bash"> access_$ ssh frontend.SITE2 </source> *There is no access to Internet from computing nodes (external IPs should be registered on proxy), therefore, download/update your stuff at the access nodes. Several revision control clients are available. *Each site has a separate NFS, therefore, to run an application on several sites at once, you need to copy it '''scp, sftp, rsync''' between access or frontend nodes. *Jobs are run from the frondend nodes, using a [http://en.wikipedia.org/wiki/OpenPBS PBS]-like system [https://www.grid5000.fr/mediawiki/index.php/Cluster_experiment-OAR2 OAR]. Basic commands: **'''oarstat''' - queue status **'''oarsub''' - job submission **'''oardel''' - job removal Interactive job on deployed images: <source lang="bash"> fontend_$ oarsub -I -t deploy -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> Batch job on installed images: <source lang="bash"> fontend_$ oarsub BATCH_FILE -t allow_classic_ssh -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> *The image to deploy can be created and loaded with help of a [http://wiki.systemimager.org/index.php/Main_Page Systemimager]-like system [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2 Kadeploy]. 
Creating: [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2#Tune_an_environment_to_build_another_one:_customize_authentification_parameters described here] Loading: <source lang="bash"> fontend_$ kadeploy3 -a PATH_TO_PRIVATE_IMAGE_DESC -f $OAR_FILE_NODES </source> A Linux distribution lenny-x64-nfs-2.1 with mc, subversion, autotools, doxygen, MPICH2, GSL, Boost, R, gnuplot, graphviz, X11, evince is available at Orsay /home/nancy/alastovetsky/grid5000. == Compiling and running MPI applications == *Compilation should be done on one of the reserved nodes (e.g. ssh `head -n 1 $OAR_NODEFILE`) *Running MPI applications is described [https://www.grid5000.fr/mediawiki/index.php/Run_MPI_On_Grid%275000 here] **mpirun/mpiexec should be run from one of the reserved nodes (e.g. ssh `head -n 1 $OAR_NODEFILE`) == Setting up new deploy image == oarsub -I -t deploy -l nodes=1,walltime=12 kadeploy3 -e lenny-x64-nfs -f $OAR_FILE_NODES -k ssh root@`head -n 1 $OAR_NODEFILE` default password: grid5000 edit /etc/apt/sources.list apt-get update apt-get upgrade apt-get install libtool autoconf automake mc colorgcc ctags libboost-serialization-dev libboost-graph-dev libatlas-base-dev gfortran vim gdb valgrind screen subversion iperf bc <br> Compiled for sources by us: *gsl-1.14 (download: ftp://ftp.gnu.org/gnu/gsl/) ./configure &amp;&amp; make &amp;&amp; make install *mpich2 (download: http://www.mcs.anl.gov/research/projects/mpich2/downloads/index.php?s=downloads) ./configure --enable-shared --enable-sharedlibs=gcc --with-pm=mpd make &amp;&amp; make install Mpich2 installed to: Installing MPE2 include files to /usr/local/include Installing MPE2 libraries to /usr/local/lib Installing MPE2 utility programs to /usr/local/bin Installing MPE2 configuration files to /usr/local/etc Installing MPE2 system utility programs to /usr/local/sbin Installing MPE2 man to /usr/local/share/man Installing MPE2 html to /usr/local/share/doc/ Installed MPE2 in /usr/local *hwloc (and lstopo) 
(download: http://www.open-mpi.org/software/hwloc/v1.2/) compile from sources. To get xml support install libxml2-dev and pkg-config apt-get install libxml2-dev pkg-config tar -xzvf hwloc-1.1.1.tar.gz cd hwloc-1.1.1 ./configure &amp;&amp; make &amp;&amp; make install Change root password. rm sources from root dir. Edit the "message of the day" vi /etc/motd.tail echo 90 > /proc/sys/vm/overcommit_ratio echo 2 > /proc/sys/vm/overcommit_memory date >> release Cleanup apt-get clean rm /etc/udev/rules.d/*-persistent-net.rules Make image ssh root@'''node''' tgz-g5k &gt; $HOME/grid5000/'''imagename'''.tgz make appropriate .env file. kaenv3 -p lenny-x64-nfs -u deploy &gt; lenny-x64-custom-2.3.env <br> == GotoBLAS2 == When compiling gotoblas on a node without direct internet access get this error: <source lang="">wget http://www.netlib.org/lapack/lapack-3.1.1.tgz --2011-05-19 03:11:03-- http://www.netlib.org/lapack/lapack-3.1.1.tgz Resolving www.netlib.org... 160.36.58.108 Connecting to www.netlib.org|160.36.58.108|:80... failed: Connection timed out. Retrying. --2011-05-19 03:14:13-- (try: 2) http://www.netlib.org/lapack/lapack-3.1.1.tgz Connecting to www.netlib.org|160.36.58.108|:80... failed: Connection timed out. Retrying. ...</source> Fix by downloading http://www.netlib.org/lapack/lapack-3.1.1.tgz to the GotoBLAS2 source directory and editing this line in the Makefile 184c184 &lt; -wget http://www.netlib.org/lapack/lapack-3.1.1.tgz --- &gt; # -wget http://www.netlib.org/lapack/lapack-3.1.1.tgz <br> GotoBLAS needs to be compiled individualy for each unique machine - ie each cluster. Add the following to .bashrc export CLUSTER=`hostname |sed 's/\([a-z]*\).*/\1/'` LD_LIBRARY_PATH=$HOME/lib/$CLUSTER:$HOME/lib:/usr/local/lib:$LD_LIBRARY_PATH export LIBRARY_PATH=$HOME/lib/$CLUSTER:$HOME/lib:/usr/local/lib:$LIBRARY_PATH Run the following script once on each cluster: <source lang="bash">#! /bin/bash echo "Compiling gotoblas for cluster: $CLUSTER" cd $HOME/src if [ ! 
-d "$CLUSTER" ]; then mkdir $CLUSTER fi cd $CLUSTER tar -xzf ../Goto*.tar.gz cd Goto* make &> m.log if [ ! -d "$HOME/lib/$CLUSTER" ]; then mkdir $HOME/lib/$CLUSTER fi cp libgoto2.so $HOME/lib/$CLUSTER echo results ls -d $HOME/src/$CLUSTER ls $HOME/src/$CLUSTER ls -d $HOME/lib/$CLUSTER ls $HOME/lib/$CLUSTER</source> == Paging and the OOM-Killer == When doing exhaustion of available memory experiments, problems can occur with over-commit. See [[HCL_cluster#Paging_and_the_OOM-Killer]] for more detail. == Example of experiment setup across several sites == Sources of all files mentioned below is available at: [[Grid5000:sources]]. Pick one head node as the main head node (I use grenoble, but any will do). Setup sources cd dave/fupermod-1.1.0 make clean ./configure --with-cblas=goto --prefix=/usr/local/ Reserve 2 nodes from all clusters on a 3 cluster site: oarsub -r "2011-07-25 11:01:01" -t deploy -l cluster=3/nodes=2,walltime=11:59:00 Automate with: for a in 2 3 4; do for i in `cat sites.$a`; do echo $a $i; ssh $i oarsub -r "2011-07-25 11:01:01" -t deploy -l cluster=$a/nodes=2,walltime=11:59:00; done; done Then on each site, where xxx is site name: kadeploy3 -a $HOME/grid5000/lenny-dave.env -f $OAR_NODE_FILE --output-ok-nodes deployed.xxx Gather deployed files to a head node: for i in `cat ~/sites `; do echo $i; scp $i:deployed* .&nbsp;; done cat deployed.* &gt; deployed.all Copy cluster specific libs to each deployed node /usr/local/lib dir with script copy_local_libs.sh deployed.all Copy source files to root dir of each deployed node. 
Then make install each (node ssh -f does this in parallel) for i in `cat ~/deployed.all`; do echo $i; rsync -aP ~/dave/fupermod-1.1.0 root@$i:&nbsp;; done for i in `cat ~/deployed.all`; do echo $i; ssh -f root@$i "cd fupermod-1.1.0&nbsp;; make all install"&nbsp;; done ssh to the first node ssh `head -n1 deployed.all` n=$(cat deployed.all |wc -l) mpdboot --totalnum=$n --file=$HOME/deployed.all mpdtrace cd dave/data/ mpirun -n $n /usr/local/bin/partitioner -l /usr/local/lib/libmxm_col.so -a0 -D10000 -o N=100 Cleanup after: for i in `cat ~/sites `; do echo $i; ssh $i rm deployed.* ; done == Check network speed == apt-get install iperf 7db83b839d7dc915e5d25f4c682a72c97395106b 635 634 2011-07-26T01:18:05Z Davepc 2 /* GotoBLAS2 */ wikitext text/x-wiki https://www.grid5000.fr/mediawiki/index.php/Grid5000:Home [https://www.grid5000.fr/mediawiki/index.php/Grid5000:UserCharter USAGE POLICY] == Login, job submission, deployment of image == *Select sites and clusters for experiments, using information on the [https://www.grid5000.fr/mediawiki/index.php/Grid5000:Network#Grid.275000_Sites Grid5000 network] and the [https://www.grid5000.fr/mediawiki/index.php/Status Status page] *Access is provided via access nodes '''access.SITE.grid5000.fr''' marked [https://www.grid5000.fr/mediawiki/index.php/External_access here] as ''accessible from '''everywhere''' via ssh with '''keyboard-interactive''' authentication method''. As soon as you are on one of the sites, you can directly ssh frontend node of any other site: <source lang="bash"> access_$ ssh frontend.SITE2 </source> *There is no access to Internet from computing nodes (external IPs should be registered on proxy), therefore, download/update your stuff at the access nodes. Several revision control clients are available. *Each site has a separate NFS, therefore, to run an application on several sites at once, you need to copy it '''scp, sftp, rsync''' between access or frontend nodes. 
*Jobs are run from the frontend nodes, using [https://www.grid5000.fr/mediawiki/index.php/Cluster_experiment-OAR2 OAR], a [http://en.wikipedia.org/wiki/OpenPBS PBS]-like system. Basic commands:
**'''oarstat''' - queue status
**'''oarsub''' - job submission
**'''oardel''' - job removal
Interactive job on deployed images:
<source lang="bash">
frontend_$ oarsub -I -t deploy -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"']
</source>
Batch job on installed images:
<source lang="bash">
frontend_$ oarsub BATCH_FILE -t allow_classic_ssh -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"']
</source>
*The image to deploy can be created and loaded with [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2 Kadeploy], a [http://wiki.systemimager.org/index.php/Main_Page Systemimager]-like system. Creating an image is [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2#Tune_an_environment_to_build_another_one:_customize_authentification_parameters described here]. Loading:
<source lang="bash">
frontend_$ kadeploy3 -a PATH_TO_PRIVATE_IMAGE_DESC -f $OAR_FILE_NODES
</source>
A Linux distribution lenny-x64-nfs-2.1 with mc, subversion, autotools, doxygen, MPICH2, GSL, Boost, R, gnuplot, graphviz, X11 and evince is available at Orsay in /home/nancy/alastovetsky/grid5000.

== Compiling and running MPI applications ==
*Compilation should be done on one of the reserved nodes (e.g. ssh `head -n 1 $OAR_NODEFILE`)
*Running MPI applications is described [https://www.grid5000.fr/mediawiki/index.php/Run_MPI_On_Grid%275000 here]
**mpirun/mpiexec should be run from one of the reserved nodes (e.g.
ssh `head -n 1 $OAR_NODEFILE`)

== Setting up new deploy image ==
Reserve a node and deploy the reference environment:
<source lang="bash">
oarsub -I -t deploy -l nodes=1,walltime=12
kadeploy3 -e lenny-x64-nfs -f $OAR_FILE_NODES -k
ssh root@`head -n 1 $OAR_NODEFILE`   # default password: grid5000
</source>
Edit /etc/apt/sources.list, then:
<source lang="bash">
apt-get update
apt-get upgrade
apt-get install libtool autoconf automake mc colorgcc ctags libboost-serialization-dev libboost-graph-dev libatlas-base-dev gfortran vim gdb valgrind screen subversion iperf bc
</source>
Compiled from sources by us:
*gsl-1.14 (download: ftp://ftp.gnu.org/gnu/gsl/)
 ./configure && make && make install
*mpich2 (download: http://www.mcs.anl.gov/research/projects/mpich2/downloads/index.php?s=downloads)
 ./configure --enable-shared --enable-sharedlibs=gcc --with-pm=mpd
 make && make install
MPICH2 installed to:
 Installing MPE2 include files to /usr/local/include
 Installing MPE2 libraries to /usr/local/lib
 Installing MPE2 utility programs to /usr/local/bin
 Installing MPE2 configuration files to /usr/local/etc
 Installing MPE2 system utility programs to /usr/local/sbin
 Installing MPE2 man to /usr/local/share/man
 Installing MPE2 html to /usr/local/share/doc/
 Installed MPE2 in /usr/local
*hwloc (and lstopo) (download: http://www.open-mpi.org/software/hwloc/v1.2/), compiled from sources. To get XML support, install libxml2-dev and pkg-config:
<source lang="bash">
apt-get install libxml2-dev pkg-config
tar -xzvf hwloc-1.1.1.tar.gz
cd hwloc-1.1.1
./configure && make && make install
</source>
Change the root password. Remove the sources from the root dir. Edit the "message of the day":
 vi /etc/motd.tail
Set the overcommit policy and record the build date:
<source lang="bash">
echo 90 > /proc/sys/vm/overcommit_ratio
echo 2 > /proc/sys/vm/overcommit_memory
date >> release
</source>
Cleanup:
<source lang="bash">
apt-get clean
rm /etc/udev/rules.d/*-persistent-net.rules
</source>
Make the image:
 ssh root@'''node'''
 tgz-g5k > $HOME/grid5000/'''imagename'''.tgz
Make an appropriate .env file.
 kaenv3 -p lenny-x64-nfs -u deploy > lenny-x64-custom-2.3.env

== GotoBLAS2 ==
When compiling GotoBLAS2 on a node without direct Internet access, you get this error:
<source>
wget http://www.netlib.org/lapack/lapack-3.1.1.tgz
--2011-05-19 03:11:03-- http://www.netlib.org/lapack/lapack-3.1.1.tgz
Resolving www.netlib.org... 160.36.58.108
Connecting to www.netlib.org|160.36.58.108|:80... failed: Connection timed out. Retrying.
--2011-05-19 03:14:13-- (try: 2) http://www.netlib.org/lapack/lapack-3.1.1.tgz
Connecting to www.netlib.org|160.36.58.108|:80... failed: Connection timed out. Retrying.
...
</source>
Fix it by downloading http://www.netlib.org/lapack/lapack-3.1.1.tgz to the GotoBLAS2 source directory and commenting out this line in the Makefile:
 184c184
 < -wget http://www.netlib.org/lapack/lapack-3.1.1.tgz
 ---
 > # -wget http://www.netlib.org/lapack/lapack-3.1.1.tgz

GotoBLAS2 needs to be compiled individually for each unique machine, i.e. each cluster. Add the following to .bashrc:
<source lang="bash">
export CLUSTER=`hostname | sed 's/\([a-z]*\).*/\1/'`
export LD_LIBRARY_PATH=$HOME/lib/$CLUSTER:$HOME/lib:/usr/local/lib:$LD_LIBRARY_PATH
export LIBRARY_PATH=$HOME/lib/$CLUSTER:$HOME/lib:/usr/local/lib:$LIBRARY_PATH
</source>
Run the following script once on each cluster:
<source lang="bash">
#!/bin/bash
echo "Compiling gotoblas for cluster: $CLUSTER"
cd $HOME/src
if [ ! -d "$CLUSTER" ]; then
    mkdir $CLUSTER
fi
cd $CLUSTER
tar -xzf ../Goto*.tar.gz
cd Goto*
make &> m.log
if [ ! -d "$HOME/lib/$CLUSTER" ]; then
    mkdir $HOME/lib/$CLUSTER
fi
cp libgoto2.so $HOME/lib/$CLUSTER
echo results
ls -d $HOME/src/$CLUSTER
ls $HOME/src/$CLUSTER
ls -d $HOME/lib/$CLUSTER
ls $HOME/lib/$CLUSTER
</source>
Note: for newer processors this may fail. If it is a Nehalem processor, try:
<source lang="bash">
make clean
make TARGET=NEHALEM
</source>

== Paging and the OOM-Killer ==
When running experiments that exhaust the available memory, problems can occur with overcommit. See [[HCL_cluster#Paging_and_the_OOM-Killer]] for more detail.
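The per-cluster library naming in the GotoBLAS2 section above relies on stripping everything after the leading letters of the hostname. A minimal sketch of what that sed expression extracts; the `cluster_of` helper and the example hostnames are hypothetical illustrations, not part of the setup:

```shell
# Extract the leading alphabetic cluster prefix from a Grid5000-style
# hostname, as done by the CLUSTER export in the GotoBLAS2 section.
cluster_of() {
    echo "$1" | sed 's/\([a-z]*\).*/\1/'
}

cluster_of "edel-42.grenoble.grid5000.fr"   # prints: edel
cluster_of "griffon-7.nancy.grid5000.fr"    # prints: griffon
```

Because node numbers differ within a cluster but the letter prefix does not, every node of one cluster resolves to the same $HOME/lib/$CLUSTER directory.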
== Example of experiment setup across several sites ==
The sources of all files mentioned below are available at [[Grid5000:sources]].

Pick one head node as the main head node (I use grenoble, but any will do). Set up the sources:
<source lang="bash">
cd dave/fupermod-1.1.0
make clean
./configure --with-cblas=goto --prefix=/usr/local/
</source>
Reserve 2 nodes from all clusters on a 3-cluster site:
<source lang="bash">
oarsub -r "2011-07-25 11:01:01" -t deploy -l cluster=3/nodes=2,walltime=11:59:00
</source>
Automate with:
<source lang="bash">
for a in 2 3 4; do for i in `cat sites.$a`; do echo $a $i; ssh $i oarsub -r "2011-07-25 11:01:01" -t deploy -l cluster=$a/nodes=2,walltime=11:59:00; done; done
</source>
Then on each site, where xxx is the site name:
<source lang="bash">
kadeploy3 -a $HOME/grid5000/lenny-dave.env -f $OAR_NODE_FILE --output-ok-nodes deployed.xxx
</source>
Gather the deployed files to a head node:
<source lang="bash">
for i in `cat ~/sites`; do echo $i; scp $i:deployed* . ; done
cat deployed.* > deployed.all
</source>
Copy the cluster-specific libs to the /usr/local/lib dir of each deployed node with the script:
 copy_local_libs.sh deployed.all
Copy the source files to the root dir of each deployed node, then make install on each (note: ssh -f does this in parallel):
<source lang="bash">
for i in `cat ~/deployed.all`; do echo $i; rsync -aP ~/dave/fupermod-1.1.0 root@$i: ; done
for i in `cat ~/deployed.all`; do echo $i; ssh -f root@$i "cd fupermod-1.1.0 ; make all install" ; done
</source>
ssh to the first node and run:
<source lang="bash">
ssh `head -n1 deployed.all`
n=$(cat deployed.all | wc -l)
mpdboot --totalnum=$n --file=$HOME/deployed.all
mpdtrace
cd dave/data/
mpirun -n $n /usr/local/bin/partitioner -l /usr/local/lib/libmxm_col.so -a0 -D10000 -o N=100
</source>
Cleanup after:
<source lang="bash">
for i in `cat ~/sites`; do echo $i; ssh $i rm deployed.* ; done
</source>

== Check network speed ==
 apt-get install iperf

This site is set up for sharing ideas, findings and experience in heterogeneous computing. Please, log in and create new or edit existing pages.
How to format wiki-pages read [[Help:Editing|here]]. == HCL software for heterogeneous computing == * Extensions for [[MPI]]: [http://hcl.ucd.ie/project/mpC mpC] [http://hcl.ucd.ie/project/HeteroMPI HeteroMPI] [http://hcl.ucd.ie/project/libELC libELC] * Extensions for [[GridRPC]]: [http://hcl.ucd.ie/project/SmartGridSolve SmartGridSolve] [http://hcl.ucd.ie/project/NI-Connect NI-Connect] * Computation benchmarking, modeling, dynamic load balancing: [http://hcl.ucd.ie/project/fupermod FuPerMod] [http://hcl.ucd.ie/project/pmm PMM] * Communication benchmarking, modeling, optimization: [http://hcl.ucd.ie/project/cpm CPM] [http://hcl.ucd.ie/project/mpiblib MPIBlib] == Heterogeneous mathematical software == * [http://hcl.ucd.ie/project/HeteroScaLAPACK HeteroScaLAPACK] * [http://hcl.ucd.ie/project/Hydropad Hydropad] == Operating systems == * [[Linux]] * [[Windows]] == Development tools == * [[C/C++]], [[Python]], [[UML]] * [[Autotools]] * [[GDB]], [[OProfile]], [[Valgrind]] * [[Doxygen]] * [[ChangeLog]], [[Subversion]] * [[Eclipse]] == [[Libraries]] == * [[GNU C Library]] * [[MPI]] * [[STL]], [[Boost]] * [[GSL]] * [[BLAS/LAPACK]] == Data processing == * [[gnuplot]] * [[Graphviz]] * [[Octave]], [[R]] * [[G3DViewer]] == Paper & Presentation Tools == * [[Dia]] * [[LaTeX]] * [[JabRef]] == Hardware == * [[HCL cluster]] * [[Other UCD Resources]] * [http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180 UTK multicores + GPU] * [[Grid5000]] [[SSH|How to connect to cluster via SSH]] [[hwloc|How to find information about the hardware]] == Mathematics == * [http://en.wikipedia.org/wiki/Confidence_interval Confidence interval (Statistics)], [http://en.wikipedia.org/wiki/Student's_t-distribution Student's t-distribution] (implemented in [[GSL]]) * [http://en.wikipedia.org/wiki/Linear_regression Linear regression] (implemented in [[GSL]]) * [http://en.wikipedia.org/wiki/Binomial_tree#Binomial_tree Binomial tree] (use [[Graphviz]] to visualize trees) * 
[http://en.wikipedia.org/wiki/Spline_interpolation Spline interpolation], [http://en.wikipedia.org/wiki/B-spline Spline approximation] (implemented in [[GSL]]) 8a7d95b2074e7230cb89330d93d931eaa5f3f005 605 600 2011-05-23T11:18:40Z Root 1 /* Hardware */ wikitext text/x-wiki This site is set up for sharing ideas, findings and experience in heterogeneous computing. Please, log in and create new or edit existing pages. How to format wiki-pages read [[Help:Editing|here]]. == HCL software for heterogeneous computing == * Extensions for [[MPI]]: [http://hcl.ucd.ie/project/mpC mpC] [http://hcl.ucd.ie/project/HeteroMPI HeteroMPI] [http://hcl.ucd.ie/project/libELC libELC] * Extensions for [[GridRPC]]: [http://hcl.ucd.ie/project/SmartGridSolve SmartGridSolve] [http://hcl.ucd.ie/project/NI-Connect NI-Connect] * Computation benchmarking, modeling, dynamic load balancing: [http://hcl.ucd.ie/project/fupermod FuPerMod] [http://hcl.ucd.ie/project/pmm PMM] * Communication benchmarking, modeling, optimization: [http://hcl.ucd.ie/project/cpm CPM] [http://hcl.ucd.ie/project/mpiblib MPIBlib] == Heterogeneous mathematical software == * [http://hcl.ucd.ie/project/HeteroScaLAPACK HeteroScaLAPACK] * [http://hcl.ucd.ie/project/Hydropad Hydropad] == Operating systems == * [[Linux]] * [[Windows]] == Development tools == * [[C/C++]], [[Python]], [[UML]] * [[Autotools]] * [[GDB]], [[OProfile]], [[Valgrind]] * [[Doxygen]] * [[ChangeLog]], [[Subversion]] * [[Eclipse]] == [[Libraries]] == * [[GNU C Library]] * [[MPI]] * [[STL]], [[Boost]] * [[GSL]] * [[BLAS/LAPACK]] == Data processing == * [[gnuplot]] * [[Graphviz]] * [[Octave]], [[R]] * [[G3DViewer]] == Paper & Presentation Tools == * [[Dia]] * [[LaTeX]] * [[JabRef]] == Hardware == * [[HCL cluster]] * [[Other UCD Resources]] * [[UTK multicores + GPU]] * [[Grid5000]] [[SSH|How to connect to cluster via SSH]] [[hwloc|How to find information about the hardware]] == Mathematics == * [http://en.wikipedia.org/wiki/Confidence_interval Confidence interval 
(Statistics)], [http://en.wikipedia.org/wiki/Student's_t-distribution Student's t-distribution] (implemented in [[GSL]]) * [http://en.wikipedia.org/wiki/Linear_regression Linear regression] (implemented in [[GSL]]) * [http://en.wikipedia.org/wiki/Binomial_tree#Binomial_tree Binomial tree] (use [[Graphviz]] to visualize trees) * [http://en.wikipedia.org/wiki/Spline_interpolation Spline interpolation], [http://en.wikipedia.org/wiki/B-spline Spline approximation] (implemented in [[GSL]]) 518ed8b585aa84cf63de39beba1b3023f1c131ad 624 605 2011-07-25T13:24:51Z Davepc 2 /* Libraries */ wikitext text/x-wiki This site is set up for sharing ideas, findings and experience in heterogeneous computing. Please, log in and create new or edit existing pages. How to format wiki-pages read [[Help:Editing|here]]. == HCL software for heterogeneous computing == * Extensions for [[MPI]]: [http://hcl.ucd.ie/project/mpC mpC] [http://hcl.ucd.ie/project/HeteroMPI HeteroMPI] [http://hcl.ucd.ie/project/libELC libELC] * Extensions for [[GridRPC]]: [http://hcl.ucd.ie/project/SmartGridSolve SmartGridSolve] [http://hcl.ucd.ie/project/NI-Connect NI-Connect] * Computation benchmarking, modeling, dynamic load balancing: [http://hcl.ucd.ie/project/fupermod FuPerMod] [http://hcl.ucd.ie/project/pmm PMM] * Communication benchmarking, modeling, optimization: [http://hcl.ucd.ie/project/cpm CPM] [http://hcl.ucd.ie/project/mpiblib MPIBlib] == Heterogeneous mathematical software == * [http://hcl.ucd.ie/project/HeteroScaLAPACK HeteroScaLAPACK] * [http://hcl.ucd.ie/project/Hydropad Hydropad] == Operating systems == * [[Linux]] * [[Windows]] == Development tools == * [[C/C++]], [[Python]], [[UML]] * [[Autotools]] * [[GDB]], [[OProfile]], [[Valgrind]] * [[Doxygen]] * [[ChangeLog]], [[Subversion]] * [[Eclipse]] == [[Libraries]] == * [[GNU C Library]] * [[MPI]] * [[STL]], [[Boost]] * [[GSL]] * [[BLAS/LAPACK ScaLAPACK]] == Data processing == * [[gnuplot]] * [[Graphviz]] * [[Octave]], [[R]] * [[G3DViewer]] == Paper & 
Presentation Tools == * [[Dia]] * [[LaTeX]] * [[JabRef]] == Hardware == * [[HCL cluster]] * [[Other UCD Resources]] * [[UTK multicores + GPU]] * [[Grid5000]] [[SSH|How to connect to cluster via SSH]] [[hwloc|How to find information about the hardware]] == Mathematics == * [http://en.wikipedia.org/wiki/Confidence_interval Confidence interval (Statistics)], [http://en.wikipedia.org/wiki/Student's_t-distribution Student's t-distribution] (implemented in [[GSL]]) * [http://en.wikipedia.org/wiki/Linear_regression Linear regression] (implemented in [[GSL]]) * [http://en.wikipedia.org/wiki/Binomial_tree#Binomial_tree Binomial tree] (use [[Graphviz]] to visualize trees) * [http://en.wikipedia.org/wiki/Spline_interpolation Spline interpolation], [http://en.wikipedia.org/wiki/B-spline Spline approximation] (implemented in [[GSL]]) d2aca935cba0926b0abb8a7ce9d2c7a830f32052 627 624 2011-07-25T13:26:17Z Davepc 2 /* Libraries */ wikitext text/x-wiki This site is set up for sharing ideas, findings and experience in heterogeneous computing. Please, log in and create new or edit existing pages. How to format wiki-pages read [[Help:Editing|here]]. 
== HCL software for heterogeneous computing == * Extensions for [[MPI]]: [http://hcl.ucd.ie/project/mpC mpC] [http://hcl.ucd.ie/project/HeteroMPI HeteroMPI] [http://hcl.ucd.ie/project/libELC libELC] * Extensions for [[GridRPC]]: [http://hcl.ucd.ie/project/SmartGridSolve SmartGridSolve] [http://hcl.ucd.ie/project/NI-Connect NI-Connect] * Computation benchmarking, modeling, dynamic load balancing: [http://hcl.ucd.ie/project/fupermod FuPerMod] [http://hcl.ucd.ie/project/pmm PMM] * Communication benchmarking, modeling, optimization: [http://hcl.ucd.ie/project/cpm CPM] [http://hcl.ucd.ie/project/mpiblib MPIBlib] == Heterogeneous mathematical software == * [http://hcl.ucd.ie/project/HeteroScaLAPACK HeteroScaLAPACK] * [http://hcl.ucd.ie/project/Hydropad Hydropad] == Operating systems == * [[Linux]] * [[Windows]] == Development tools == * [[C/C++]], [[Python]], [[UML]] * [[Autotools]] * [[GDB]], [[OProfile]], [[Valgrind]] * [[Doxygen]] * [[ChangeLog]], [[Subversion]] * [[Eclipse]] == [[Libraries]] == * [[GNU C Library]] * [[MPI]] * [[STL]], [[Boost]] * [[GSL]] * [[BLAS LAPACK ScaLAPACK]] == Data processing == * [[gnuplot]] * [[Graphviz]] * [[Octave]], [[R]] * [[G3DViewer]] == Paper & Presentation Tools == * [[Dia]] * [[LaTeX]] * [[JabRef]] == Hardware == * [[HCL cluster]] * [[Other UCD Resources]] * [[UTK multicores + GPU]] * [[Grid5000]] [[SSH|How to connect to cluster via SSH]] [[hwloc|How to find information about the hardware]] == Mathematics == * [http://en.wikipedia.org/wiki/Confidence_interval Confidence interval (Statistics)], [http://en.wikipedia.org/wiki/Student's_t-distribution Student's t-distribution] (implemented in [[GSL]]) * [http://en.wikipedia.org/wiki/Linear_regression Linear regression] (implemented in [[GSL]]) * [http://en.wikipedia.org/wiki/Binomial_tree#Binomial_tree Binomial tree] (use [[Graphviz]] to visualize trees) * [http://en.wikipedia.org/wiki/Spline_interpolation Spline interpolation], [http://en.wikipedia.org/wiki/B-spline Spline 
approximation] (implemented in [[GSL]]) ea9c8ca70b6232d1f2bf93e664c6dfb8405591b8 637 627 2011-09-29T18:51:51Z Davepc 2 /* Libraries */ wikitext text/x-wiki This site is set up for sharing ideas, findings and experience in heterogeneous computing. Please, log in and create new or edit existing pages. How to format wiki-pages read [[Help:Editing|here]]. == HCL software for heterogeneous computing == * Extensions for [[MPI]]: [http://hcl.ucd.ie/project/mpC mpC] [http://hcl.ucd.ie/project/HeteroMPI HeteroMPI] [http://hcl.ucd.ie/project/libELC libELC] * Extensions for [[GridRPC]]: [http://hcl.ucd.ie/project/SmartGridSolve SmartGridSolve] [http://hcl.ucd.ie/project/NI-Connect NI-Connect] * Computation benchmarking, modeling, dynamic load balancing: [http://hcl.ucd.ie/project/fupermod FuPerMod] [http://hcl.ucd.ie/project/pmm PMM] * Communication benchmarking, modeling, optimization: [http://hcl.ucd.ie/project/cpm CPM] [http://hcl.ucd.ie/project/mpiblib MPIBlib] == Heterogeneous mathematical software == * [http://hcl.ucd.ie/project/HeteroScaLAPACK HeteroScaLAPACK] * [http://hcl.ucd.ie/project/Hydropad Hydropad] == Operating systems == * [[Linux]] * [[Windows]] == Development tools == * [[C/C++]], [[Python]], [[UML]] * [[Autotools]] * [[GDB]], [[OProfile]], [[Valgrind]] * [[Doxygen]] * [[ChangeLog]], [[Subversion]] * [[Eclipse]] == [[Libraries]] == * [[GNU C Library]] * [[MPI]] * [[STL]], [[Boost]] * [[GSL]] * [[BLAS LAPACK ScaLAPACK]] * [[NLOPT]] == Data processing == * [[gnuplot]] * [[Graphviz]] * [[Octave]], [[R]] * [[G3DViewer]] == Paper & Presentation Tools == * [[Dia]] * [[LaTeX]] * [[JabRef]] == Hardware == * [[HCL cluster]] * [[Other UCD Resources]] * [[UTK multicores + GPU]] * [[Grid5000]] [[SSH|How to connect to cluster via SSH]] [[hwloc|How to find information about the hardware]] == Mathematics == * [http://en.wikipedia.org/wiki/Confidence_interval Confidence interval (Statistics)], [http://en.wikipedia.org/wiki/Student's_t-distribution Student's t-distribution] 
(implemented in [[GSL]]) * [http://en.wikipedia.org/wiki/Linear_regression Linear regression] (implemented in [[GSL]]) * [http://en.wikipedia.org/wiki/Binomial_tree#Binomial_tree Binomial tree] (use [[Graphviz]] to visualize trees) * [http://en.wikipedia.org/wiki/Spline_interpolation Spline interpolation], [http://en.wikipedia.org/wiki/B-spline Spline approximation] (implemented in [[GSL]]) c1481a7eb07cbf8cbb99bb65cec241cc4222b8af UML 0 75 601 2011-05-23T09:57:29Z Root 1 Created page with "[http://en.wikipedia.org/wiki/Unified_Modeling_Language Unified Modeling Language]. Use any of the following tools: * [[Dia]] * [[Eclipse]] MDT-UML2 * [http://uml.sourceforge.ne…" wikitext text/x-wiki [http://en.wikipedia.org/wiki/Unified_Modeling_Language Unified Modeling Language]. Use any of the following tools: * [[Dia]] * [[Eclipse]] MDT-UML2 * [http://uml.sourceforge.net Umbrello] * [http://bouml.free.fr BOUML] 7927afbac800b591614109df3fcc884247fc044b 603 601 2011-05-23T11:13:23Z Root 1 wikitext text/x-wiki [http://en.wikipedia.org/wiki/Unified_Modeling_Language Unified Modeling Language]. Use any of the following tools: * [[Dia]] + [http://www.aarontrevena.co.uk/opensource/autodia/ AutoDia] * [[Eclipse]] MDT-UML2 * [http://uml.sourceforge.net Umbrello] * [http://bouml.free.fr BOUML] 3991722fd5dd67b5cd93e64ebc8d56021f43d3c4 604 603 2011-05-23T11:13:35Z Root 1 wikitext text/x-wiki [http://en.wikipedia.org/wiki/Unified_Modeling_Language Unified Modeling Language].
Use any of the following tools: * [[Dia]] + [http://www.aarontrevena.co.uk/opensource/autodia/ AutoDia] * [[Eclipse]] + MDT-UML2 * [http://uml.sourceforge.net Umbrello] * [http://bouml.free.fr BOUML] 929bc6cbf0893da04785a415a4e76b68a6e8fafa Dia 0 8 602 562 2011-05-23T11:12:46Z Root 1 wikitext text/x-wiki an alternative to Visio for making diagram http://live.gnome.org/Dia/ http://dia-installer.de/index_en.html (for Windows) http://www.aarontrevena.co.uk/opensource/autodia/ (UML reverse engineering: OO sources -> Dia) 04dc9414673a0698348bc766a1cb1c8d4a7cc72a UTK multicores + GPU 0 76 606 2011-05-23T11:19:27Z Root 1 Created page with "[http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180 List of machines]" wikitext text/x-wiki [http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180 List of machines] 95c3b0bca932defb16b98f64102c45098447b137 HCL Cluster Specifications 0 50 608 491 2011-06-13T16:38:28Z Davepc 2 /* Cluster Specifications */ wikitext text/x-wiki ==Cluster Specifications== {| border="1" cellspacing="1" cellpadding="5" | Name | Make/Model | IP | Processor | Front Side Bus | L2 Cache | RAM | HDD 1 | HDD 2 | NIC | Rack |- | ALIGN=LEFT | hclswitch1 | ALIGN=LEFT | Cisco Catalyst 3560G | ALIGN=LEFT | 192.168.21.252 | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | 24 x Gigabit | 42 |- | ALIGN=LEFT | hclswitch2 | ALIGN=LEFT | Cisco Catalyst 3560G | ALIGN=LEFT | 192.168.21.253 | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | 24 x Gigabit | 41 |- | ALIGN=LEFT | N/A | ALIGN=LEFT | APC Smart UPS 1500 | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | ALIGN=LEFT | N/A | 1 – 2 |- | ALIGN=LEFT | [[Hcl01]] (NIC1) | ALIGN=LEFT | Dell Poweredge 750 | ALIGN=LEFT | 192.168.21.3 | ALIGN=LEFT | 3.4 Xeon | ALIGN=LEFT | 
800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | 250GB SATA | ALIGN=LEFT | 2 x Gigabit | 3 |- | ALIGN=LEFT | Hcl01 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.103 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 3 |- | ALIGN=LEFT | [[Hcl02]] (NIC1) | ALIGN=LEFT | Dell Poweredge 750 | ALIGN=LEFT | 192.168.21.4 | ALIGN=LEFT | 3.4 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB (4x256) | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | 250GB SATA | ALIGN=LEFT | 2 x Gigabit | 4 |- | ALIGN=LEFT | Hcl02 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.104 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 4 |- | ALIGN=LEFT | [[Hcl03]] (NIC1) | ALIGN=LEFT | Dell Poweredge 750 | ALIGN=LEFT | 192.168.21.5 | ALIGN=LEFT | 3.4 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB (2x512) | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit | 5 |- | ALIGN=LEFT | Hcl03 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.105 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 5 |- | ALIGN=LEFT | [[Hcl04]] (NIC1) | ALIGN=LEFT | Dell Poweredge 750 | ALIGN=LEFT | 192.168.21.6 | ALIGN=LEFT | 3.4 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit | 6 |- | ALIGN=LEFT | Hcl04 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.106 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 6 |- | ALIGN=LEFT | [[Hcl05]] (NIC1) | ALIGN=LEFT | Dell Poweredge SC1425 | ALIGN=LEFT | 192.168.21.7 | ALIGN=LEFT | 3.6 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 2MB | ALIGN=LEFT | 
256MB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit | 7 |- | ALIGN=LEFT | Hcl05 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.107 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 7 |- | ALIGN=LEFT | [[Hcl06]] (NIC1) | ALIGN=LEFT | Dell Poweredge SC1425 | ALIGN=LEFT | 192.168.21.8 | ALIGN=LEFT | 3.0 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 2MB | ALIGN=LEFT | 256MB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit | 8 |- | ALIGN=LEFT | Hcl06 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.108 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 8 |- | ALIGN=LEFT | [[Hcl07]](NIC1) | ALIGN=LEFT | Dell Poweredge 750 | ALIGN=LEFT | 192.168.21.9 | ALIGN=LEFT | 3.4 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 256MB (1x256) | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit | 9 |- | ALIGN=LEFT | Hcl07 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.109 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 9 |- | ALIGN=LEFT | [[Hcl08]] (NIC1) | ALIGN=LEFT | Dell Poweredge 750 | ALIGN=LEFT | 192.168.21.10 | ALIGN=LEFT | 3.4 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 256MB (1x256) | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit | 10 |- | ALIGN=LEFT | Hcl08 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.110 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 10 |- | ALIGN=LEFT | [[Hcl09]] (NIC1) | ALIGN=LEFT | IBM E-server 326 | ALIGN=LEFT | 192.168.21.11 | ALIGN=LEFT | 1.8 AMD Opteron | ALIGN=LEFT | 1GHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT 
| N/A | ALIGN=LEFT | 2 x Gigabit | 11 |- | ALIGN=LEFT | Hcl09 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.111 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 11 |- | ALIGN=LEFT | [[Hcl10]] (NIC1) | ALIGN=LEFT | IBM E-server 326 | ALIGN=LEFT | 192.168.21.12 | ALIGN=LEFT | 1.8 AMD Opteron | ALIGN=LEFT | 1GHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit | 12 |- | ALIGN=LEFT | Hcl10 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.112 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 12 |- | ALIGN=LEFT | [[Hcl11]] (NIC1) | ALIGN=LEFT | IBM X-Series 306 | ALIGN=LEFT | 192.168.21.13 | ALIGN=LEFT | 3.2 P4 | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 512MB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit | 13 |- | ALIGN=LEFT | Hcl11 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.113 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 13 |- | ALIGN=LEFT | [[Hcl12]] (NIC1) | ALIGN=LEFT | HP Proliant DL 320 G3 | ALIGN=LEFT | 192.168.21.14 | ALIGN=LEFT | 3.4 P4 | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 512MB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit | 14 |- | ALIGN=LEFT | Hcl12 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.114 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 14 |- | ALIGN=LEFT | [[Hcl13]] (NIC1) | ALIGN=LEFT | HP Proliant DL 320 G3 | ALIGN=LEFT | 192.168.21.15 | ALIGN=LEFT | 2.9 Celeron | ALIGN=LEFT | 533MHz | ALIGN=LEFT | 256KB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit | 15 |- | 
ALIGN=LEFT | Hcl13 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.115 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 15 |- | ALIGN=LEFT | [[Hcl14]] (NIC1) | ALIGN=LEFT | HP Proliant DL 140 G2 | ALIGN=LEFT | 192.168.21.16 | ALIGN=LEFT | 3.4 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit | 16 |- | ALIGN=LEFT | Hcl14 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.116 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 16 |- | ALIGN=LEFT | [[Hcl15]] (NIC1) | ALIGN=LEFT | HP Proliant DL 140 G2 | ALIGN=LEFT | 192.168.21.17 | ALIGN=LEFT | 2.8 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 1MB | ALIGN=LEFT | 1GB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit | 17 |- | ALIGN=LEFT | Hcl15 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.117 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 17 |- | ALIGN=LEFT | [[Hcl16]] (NIC1) | ALIGN=LEFT | HP Proliant DL 140 G2 | ALIGN=LEFT | 192.168.21.18 | ALIGN=LEFT | 3.6 Xeon | ALIGN=LEFT | 800MHz | ALIGN=LEFT | 2MB | ALIGN=LEFT | 384MB | ALIGN=LEFT | 80GB SATA | ALIGN=LEFT | N/A | ALIGN=LEFT | 2 x Gigabit | 18 |- | ALIGN=LEFT | Hcl16 (NIC2) | ALIGN=LEFT | <BR> | ALIGN=LEFT | 192.168.21.118 | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | ALIGN=LEFT | <BR> | 18 |} ==Cluster Benchmarks== ===Stream=== ====Cluster Performance==== ---------------------------------------------- Double precision appears to have 16 digits of accuracy Assuming 8 bytes per DOUBLE PRECISION word ---------------------------------------------- Number of processors = 18 Array size = 2000000 Offset = 
0 The total memory requirement is 824.0 MB ( 45.8MB/task) You are running each test 10 times -- The *best* time for each test is used *EXCLUDING* the first and last iterations ---------------------------------------------------- Your clock granularity appears to be less than one microsecond Your clock granularity/precision appears to be 1 microseconds ---------------------------------------------------- Function Rate (MB/s) Avg time Min time Max time Copy: 24589.1750 0.0235 0.0234 0.0237 Scale: 24493.9786 0.0237 0.0235 0.0245 Add: 27594.1797 0.0314 0.0313 0.0315 Triad: 27695.7938 0.0313 0.0312 0.0315 ----------------------------------------------- Solution Validates! ----------------------------------------------- ====Individual Node Performance==== [[Image:stream_bench_results.png]] hcl01.heterogeneous.ucd.ie ------------------------------------------------------------- STREAM version $Revision: 5.9 $ ------------------------------------------------------------- This system uses 8 bytes per DOUBLE PRECISION word. ------------------------------------------------------------- Array size = 2000000, Offset = 0 Total memory required = 45.8 MB. Each test is run 10 times, but only the *best* time for each is used. ------------------------------------------------------------- Number of Threads requested = 1 ------------------------------------------------------------- Printing one line per active thread.... ------------------------------------------------------------- Your clock granularity appears to be less than one microsecond. Each test below will take on the order of 8386 microseconds. (= 8386 clock ticks) Increase the size of the arrays if this shows that you are not getting at least 20 clock ticks per test. ------------------------------------------------------------- WARNING -- The above is only a rough guideline. For best results, please be sure you know the precision of your system timer. 
------------------------------------------------------------- Function Rate (MB/s) Avg time Min time Max time Copy: 2615.4617 0.0125 0.0122 0.0132 Scale: 2609.2783 0.0125 0.0123 0.0133 Add: 3046.4707 0.0161 0.0158 0.0168 Triad: 3064.7322 0.0160 0.0157 0.0166 ------------------------------------------------------------- Solution Validates ------------------------------------------------------------- hcl02.heterogeneous.ucd.ie ------------------------------------------------------------- STREAM version $Revision: 5.9 $ ------------------------------------------------------------- This system uses 8 bytes per DOUBLE PRECISION word. ------------------------------------------------------------- Array size = 2000000, Offset = 0 Total memory required = 45.8 MB. Each test is run 10 times, but only the *best* time for each is used. ------------------------------------------------------------- Number of Threads requested = 1 ------------------------------------------------------------- Printing one line per active thread.... ------------------------------------------------------------- Your clock granularity appears to be less than one microsecond. Each test below will take on the order of 7790 microseconds. (= 7790 clock ticks) Increase the size of the arrays if this shows that you are not getting at least 20 clock ticks per test. ------------------------------------------------------------- WARNING -- The above is only a rough guideline. For best results, please be sure you know the precision of your system timer. 
------------------------------------------------------------- Function Rate (MB/s) Avg time Min time Max time Copy: 2740.6840 0.0117 0.0117 0.0120 Scale: 2745.3930 0.0117 0.0117 0.0117 Add: 3063.1599 0.0157 0.0157 0.0157 Triad: 3075.3572 0.0156 0.0156 0.0159 ------------------------------------------------------------- Solution Validates ------------------------------------------------------------- hcl03.heterogeneous.ucd.ie ------------------------------------------------------------- STREAM version $Revision: 5.9 $ ------------------------------------------------------------- This system uses 8 bytes per DOUBLE PRECISION word. ------------------------------------------------------------- Array size = 2000000, Offset = 0 Total memory required = 45.8 MB. Each test is run 10 times, but only the *best* time for each is used. ------------------------------------------------------------- Number of Threads requested = 1 ------------------------------------------------------------- Printing one line per active thread.... ------------------------------------------------------------- Your clock granularity appears to be less than one microsecond. Each test below will take on the order of 8382 microseconds. (= 8382 clock ticks) Increase the size of the arrays if this shows that you are not getting at least 20 clock ticks per test. ------------------------------------------------------------- WARNING -- The above is only a rough guideline. For best results, please be sure you know the precision of your system timer. 
------------------------------------------------------------- Function Rate (MB/s) Avg time Min time Max time Copy: 2788.6728 0.0115 0.0115 0.0115 Scale: 2722.0144 0.0118 0.0118 0.0121 Add: 3266.6166 0.0148 0.0147 0.0150 Triad: 3287.4249 0.0146 0.0146 0.0147 ------------------------------------------------------------- Solution Validates ------------------------------------------------------------- hcl04.heterogeneous.ucd.ie ------------------------------------------------------------- STREAM version $Revision: 5.9 $ ------------------------------------------------------------- This system uses 8 bytes per DOUBLE PRECISION word. ------------------------------------------------------------- Array size = 2000000, Offset = 0 Total memory required = 45.8 MB. Each test is run 10 times, but only the *best* time for each is used. ------------------------------------------------------------- Number of Threads requested = 1 ------------------------------------------------------------- Printing one line per active thread.... ------------------------------------------------------------- Your clock granularity appears to be less than one microsecond. Each test below will take on the order of 8378 microseconds. (= 8378 clock ticks) Increase the size of the arrays if this shows that you are not getting at least 20 clock ticks per test. ------------------------------------------------------------- WARNING -- The above is only a rough guideline. For best results, please be sure you know the precision of your system timer. 
-------------------------------------------------------------
Function      Rate (MB/s)   Avg time   Min time   Max time
Copy:          2815.4071     0.0114     0.0114     0.0114
Scale:         2751.5270     0.0116     0.0116     0.0117
Add:           3260.1979     0.0148     0.0147     0.0150
Triad:         3274.2183     0.0147     0.0147     0.0150
-------------------------------------------------------------
Solution Validates
-------------------------------------------------------------

The remaining nodes all ran STREAM version $Revision: 5.9 $ with the same configuration: 8 bytes per DOUBLE PRECISION word; array size = 2000000, offset = 0; total memory required = 45.8 MB; each test is run 10 times and only the *best* time is used; every run reported "Solution Validates". (Each run also prints the standard STREAM caution that the clock-granularity estimate is only a rough guideline.) Best rates, in MB/s:

{| border=1
|-
! Node !! Threads !! Copy !! Scale !! Add !! Triad
|-
| hcl05 || 1 || 1581.8068 || 1557.1081 || 1807.7015 || 1832.9646
|-
| hcl06 || 1 || 1558.3835 || 1550.3951 || 1863.4239 || 1885.3916
|-
| hcl07 || 1 || 1759.4014 || 1740.2731 || 2036.4962 || 2045.5920
|-
| hcl08 || 1 || 1809.6454 || 1784.8271 || 2085.3320 || 2095.7094
|-
| hcl09 || 2 || 2838.6204 || 2774.6396 || 3144.0396 || 3160.7905
|-
| hcl10 || 2 || 2864.2732 || 2784.2952 || 3162.2553 || 3223.8430
|-
| hcl11 || 1 || 1842.8984 || 1818.1816 || 2109.8918 || 2119.0161
|-
| hcl12 || 1 || 1805.0657 || 1795.5385 || 2089.9636 || 2097.5313
|-
| hcl13 || 1 || 1768.9387 || 1765.9100 || 2000.5875 || 2001.0897
|-
| hcl14 || 1 || 2254.3285 || 2291.6040 || 2779.5354 || 2804.8951
|-
| hcl15 || 1 || 2223.0043 || 2264.6703 || 2740.5171 || 2762.2584
|-
| hcl16 || 1 || 2285.3882 || 2294.5687 || 2800.9568 || 2823.2048
|}

e39f984ea9c4385f5ad4d5231949e19abb0e0276 Eclipse 0 9 609 440 2011-07-18T12:33:20Z Root 1 wikitext text/x-wiki http://www.eclipse.org * Use either Sun Java (Debian package: sun-java6-jre) or OpenJDK (Debian package: openjdk-6-jre). GNU Java (GCJ) may be too slow. If you experience problems with Sun Java, the cause may be IPv6; to resolve it, add '''a new line''' <code>-Djava.net.preferIPv4Stack=true</code> at the end of '''eclipse.ini''' (it must come after <code>-vmargs</code>) * It is recommended to use the latest version from http://www.eclipse.org/downloads/ - the Eclipse CDT packages for C/C++ Developers or Parallel Application Developers * How to install plugins: http://agile.csc.ncsu.edu/SEMaterials/tutorials/install_plugin/ http://www.venukb.com/2006/08/20/install-eclipse-plugins-the-easy-way/ == Plugins == * [[Eclox]] * [[Subversive]] or [[Subclipse]] * [[TeXlipse]] == Usage == * To avoid unresolved inclusions at edit time, add paths to Project -> Properties -> C/C++ General -> Paths and Symbols -> Includes * The comments <code>// TODO: ...</code> mark what you are going to do later. These parts of the code can easily be found if you open the Tasks view (Window -> Show View -> Tasks) 5241ec59c45f32534fa4e2eb196bc620f811e31e 613 609 2011-07-18T12:47:39Z Root 1 wikitext text/x-wiki http://www.eclipse.org * Use either Sun Java (Debian package: sun-java6-jre) or OpenJDK (Debian package: openjdk-6-jre). GNU Java (GCJ) may be too slow.
If you experience problems with Sun Java, the cause may be IPv6; to resolve it, add '''a new line''' <code>-Djava.net.preferIPv4Stack=true</code> at the end of '''eclipse.ini''' (it must come after <code>-vmargs</code>) * It is recommended to use the latest version from http://www.eclipse.org/downloads/ - the Eclipse CDT packages for C/C++ Developers or Parallel Application Developers * How to install plugins: http://agile.csc.ncsu.edu/SEMaterials/tutorials/install_plugin/ http://www.venukb.com/2006/08/20/install-eclipse-plugins-the-easy-way/ == Plugins == * [[Eclox]] * [[Subversive]] or [[Subclipse]] == Usage == * To avoid unresolved inclusions at edit time, add paths to Project -> Properties -> C/C++ General -> Paths and Symbols -> Includes * The comments <code>// TODO: ...</code> mark what you are going to do later. These parts of the code can easily be found if you open the Tasks view (Window -> Show View -> Tasks) 9dce2d5f32138d38a1b5ef0764653a26aab25ecc Eclox 0 11 610 210 2011-07-18T12:38:33Z Root 1 wikitext text/x-wiki [[Doxygen]] for [[Eclipse]] http://home.gna.org/eclox/ Eclipse update site: http://download.gna.org/eclox/update/site.xml If Doxygen is integrated into the autoconf/automake scripts, there is no need to run Doxygen from Eclox, but the editor for Doxygen configuration files provided by Eclox may be useful. 721a0b1359e6fcafc25a13166acb03e147b63408 611 610 2011-07-18T12:40:55Z Root 1 wikitext text/x-wiki [[Doxygen]] for [[Eclipse]] http://home.gna.org/eclox/ Eclipse update site: http://download.gna.org/eclox/update If Doxygen is integrated into the autoconf/automake scripts, there is no need to run Doxygen from Eclox, but the editor for Doxygen configuration files provided by Eclox may be useful.
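For the IPv6 workaround described above, the tail of '''eclipse.ini''' ends up looking roughly like this (a sketch: the memory flags are illustrative placeholders and the exact contents vary by Eclipse version; the key point is that the added property must come after <code>-vmargs</code>):

```
-vmargs
-Xms256m
-Xmx1024m
-Djava.net.preferIPv4Stack=true
```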
86653f9fddc0a813dc31a07f66d1db7cfa12e761 LaTeX 0 20 612 572 2011-07-18T12:47:15Z Root 1 wikitext text/x-wiki *[http://en.wikipedia.org/wiki/BibTeX BibTeX] - reference management software *[http://en.wikipedia.org/wiki/Beamer_(LaTeX) Beamer] - a package for presentation slides *Listings - a package for typesetting program code in LaTeX - http://tug.ctan.org/cgi-bin/ctanPackageInformation.py?id=listings *LaTeX can be used in the [[Doxygen]] documentation to include formulas, bibliographic references, etc. == Editors == *vim with [http://vim-latex.sourceforge.net/index.php?subject=manual&title=Tutorial#tutorial vim-latex] *Kile *Emacs + plugin *[[Eclipse]] + [http://texlipse.sourceforge.net/ TeXlipse] == Windows == *[http://miktex.org/ MiKTeX] - LaTeX implementation *[http://www.texniccenter.org/ TeXnicCenter] - editor == Working with eps images == To build a .tex file with .eps images: <source lang="">latex file.tex
dvipdf file.dvi</source> Or convert all eps images to pdf <source lang="">for i in *.eps; do echo "$i"; epspdf "$i"; done</source> and then run <source lang="">pdflatex file.tex</source> If you include graphics without the extension, both latex and pdflatex will work, provided you keep both the .eps and .pdf files in your image directory.
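The extensionless-graphics tip above, as a minimal sketch (the figure name <code>plot</code> is a hypothetical placeholder; keep both plot.eps and plot.pdf in the image directory):

```latex
\documentclass{article}
\usepackage{graphicx}
\begin{document}
% No extension given: latex/dvipdf picks up plot.eps,
% while pdflatex picks up plot.pdf automatically.
\includegraphics[width=0.8\linewidth]{plot}
\end{document}
```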
d8a226fdd0fa5f87c691079a8d084d9dfc3c4c6f Old HCL Cluster Specifications 0 66 617 463 2011-07-21T16:04:48Z Davepc 2 wikitext text/x-wiki Pre May 2010 Cluster specifications <TABLE FRAME=BOX CELLSPACING=0 COLS=12 RULES=GROUPS BORDER=1> <TR> <TD WIDTH=86 HEIGHT=16 ALIGN=LEFT>Rack Slot</TD> <TD WIDTH=116 ALIGN=LEFT>Name</TD> <TD WIDTH=160 ALIGN=LEFT>Make/Model</TD> <TD WIDTH=100 ALIGN=LEFT>O/S</TD> <TD WIDTH=100 ALIGN=LEFT>IP</TD> <TD WIDTH=111 ALIGN=LEFT>Processor</TD> <TD WIDTH=103 ALIGN=LEFT>Front Side Bus</TD> <TD WIDTH=86 ALIGN=LEFT>L2 Cache</TD> <TD WIDTH=86 ALIGN=LEFT>RAM</TD> <TD WIDTH=86 ALIGN=LEFT>HDD 1</TD> <TD WIDTH=86 ALIGN=LEFT>HDD 2</TD> <TD WIDTH=86 ALIGN=LEFT>NIC</TD> </TR> <TR> <TD HEIGHT=22 ALIGN=RIGHT SDVAL="42" SDNUM="1033;">42</TD> <TD ALIGN=LEFT>hclswitch1 </TD> <TD ALIGN=LEFT>Cisco Catalyst 3560G</TD> <TD ALIGN=LEFT>12.2(25)SEB2</TD> <TD ALIGN=LEFT>192.168.21.252</TD> <TD ALIGN=LEFT>N/A</TD> <TD ALIGN=LEFT>N/A</TD> <TD ALIGN=LEFT>N/A</TD> <TD ALIGN=LEFT>N/A</TD> <TD ALIGN=LEFT>N/A</TD> <TD ALIGN=LEFT>N/A</TD> <TD ALIGN=LEFT>24 x Gigabit</TD> </TR> <TR> <TD HEIGHT=22 ALIGN=RIGHT SDVAL="41" SDNUM="1033;">41</TD> <TD ALIGN=LEFT>hclswitch2</TD> <TD ALIGN=LEFT>Cisco Catalyst 3560G</TD> <TD ALIGN=LEFT>12.2(25)SEB2</TD> <TD ALIGN=LEFT>192.168.21.253</TD> <TD ALIGN=LEFT>N/A</TD> <TD ALIGN=LEFT>N/A</TD> <TD ALIGN=LEFT>N/A</TD> <TD ALIGN=LEFT>N/A</TD> <TD ALIGN=LEFT>N/A</TD> <TD ALIGN=LEFT>N/A</TD> <TD ALIGN=LEFT>24 x Gigabit</TD> </TR> <TR> <TD HEIGHT=16 ALIGN=RIGHT SDNUM="1033;0;M/D/YY">1 - 2</TD> <TD ALIGN=LEFT>N/A</TD> <TD ALIGN=LEFT>APC Smart UPS 1500</TD> <TD ALIGN=LEFT>N/A</TD> <TD ALIGN=LEFT>N/A</TD> <TD ALIGN=LEFT>N/A</TD> <TD ALIGN=LEFT>N/A</TD> <TD ALIGN=LEFT>N/A</TD> <TD ALIGN=LEFT>N/A</TD> <TD ALIGN=LEFT>N/A</TD> <TD ALIGN=LEFT>N/A</TD> <TD ALIGN=LEFT>N/A</TD> </TR> <TR> <TD HEIGHT=22 ALIGN=RIGHT SDVAL="3" SDNUM="1033;">3</TD> <TD ALIGN=LEFT>Hcl01 (NIC1)</TD> <TD ALIGN=LEFT>Dell Poweredge SC1425</TD> <TD ALIGN=LEFT>FC4</TD> <TD 
ALIGN=LEFT>192.168.21.3</TD> <TD ALIGN=LEFT>3.6 Xeon</TD> <TD ALIGN=LEFT>800MHz</TD> <TD ALIGN=LEFT>2MB</TD> <TD ALIGN=LEFT>256MB</TD> <TD ALIGN=LEFT>240GB SCSI</TD> <TD ALIGN=LEFT>80GB SCSI</TD> <TD ALIGN=LEFT>2 x Gigabit</TD> </TR> <TR> <TD HEIGHT=16 ALIGN=RIGHT SDVAL="3" SDNUM="1033;">3</TD> <TD ALIGN=LEFT>Hcl01 (NIC2)</TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT>192.168.21.103</TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> </TR> <TR> <TD HEIGHT=22 ALIGN=RIGHT SDVAL="4" SDNUM="1033;">4</TD> <TD ALIGN=LEFT>Hcl02 (NIC1)</TD> <TD ALIGN=LEFT>Dell Poweredge SC1425</TD> <TD ALIGN=LEFT>FC4</TD> <TD ALIGN=LEFT>192.168.21.4</TD> <TD ALIGN=LEFT>3.6 Xeon</TD> <TD ALIGN=LEFT>800MHz</TD> <TD ALIGN=LEFT>2MB</TD> <TD ALIGN=LEFT>256MB</TD> <TD ALIGN=LEFT>240GB SCSI</TD> <TD ALIGN=LEFT>80GB SCSI</TD> <TD ALIGN=LEFT>2 x Gigabit</TD> </TR> <TR> <TD HEIGHT=16 ALIGN=RIGHT SDVAL="4" SDNUM="1033;">4</TD> <TD ALIGN=LEFT>Hcl02 (NIC2)</TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT>192.168.21.104</TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> </TR> <TR> <TD HEIGHT=22 ALIGN=RIGHT SDVAL="5" SDNUM="1033;">5</TD> <TD ALIGN=LEFT>Hcl03 (NIC1)</TD> <TD ALIGN=LEFT>Dell Poweredge 750</TD> <TD ALIGN=LEFT>FC4</TD> <TD ALIGN=LEFT>192.168.21.5</TD> <TD ALIGN=LEFT>3.4 Xeon</TD> <TD ALIGN=LEFT>800MHz</TD> <TD ALIGN=LEFT>1MB</TD> <TD ALIGN=LEFT>1GB</TD> <TD ALIGN=LEFT>80GB SATA</TD> <TD ALIGN=LEFT>N/A</TD> <TD ALIGN=LEFT>2 x Gigabit</TD> </TR> <TR> <TD HEIGHT=16 ALIGN=RIGHT SDVAL="5" SDNUM="1033;">5</TD> <TD ALIGN=LEFT>Hcl03 (NIC2)</TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT>192.168.21.105</TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> 
<TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> </TR> <TR> <TD HEIGHT=22 ALIGN=RIGHT SDVAL="6" SDNUM="1033;">6</TD> <TD ALIGN=LEFT>Hcl04 (NIC1)</TD> <TD ALIGN=LEFT>Dell Poweredge 750</TD> <TD ALIGN=LEFT>FC4</TD> <TD ALIGN=LEFT>192.168.21.6</TD> <TD ALIGN=LEFT>3.4 Xeon</TD> <TD ALIGN=LEFT>800MHz</TD> <TD ALIGN=LEFT>1MB</TD> <TD ALIGN=LEFT>1GB</TD> <TD ALIGN=LEFT>80GB SATA</TD> <TD ALIGN=LEFT>N/A</TD> <TD ALIGN=LEFT>2 x Gigabit</TD> </TR> <TR> <TD HEIGHT=16 ALIGN=RIGHT SDVAL="6" SDNUM="1033;">6</TD> <TD ALIGN=LEFT>Hcl04 (NIC2)</TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT>192.168.21.106</TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> </TR> <TR> <TD HEIGHT=22 ALIGN=RIGHT SDVAL="7" SDNUM="1033;">7</TD> <TD ALIGN=LEFT>Hcl05 (NIC1)</TD> <TD ALIGN=LEFT>Dell Poweredge 750</TD> <TD ALIGN=LEFT>FC4</TD> <TD ALIGN=LEFT>192.168.21.7</TD> <TD ALIGN=LEFT>3.4 Xeon</TD> <TD ALIGN=LEFT>800MHz</TD> <TD ALIGN=LEFT>1MB</TD> <TD ALIGN=LEFT>1GB</TD> <TD ALIGN=LEFT>80GB SATA</TD> <TD ALIGN=LEFT>N/A</TD> <TD ALIGN=LEFT>2 x Gigabit</TD> </TR> <TR> <TD HEIGHT=16 ALIGN=RIGHT SDVAL="7" SDNUM="1033;">7</TD> <TD ALIGN=LEFT>Hcl05 (NIC2)</TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT>192.168.21.107</TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> </TR> <TR> <TD HEIGHT=22 ALIGN=RIGHT SDVAL="8" SDNUM="1033;">8</TD> <TD ALIGN=LEFT>Hcl06 (NIC1)</TD> <TD ALIGN=LEFT>Dell Poweredge 750</TD> <TD ALIGN=LEFT>FC4</TD> <TD ALIGN=LEFT>192.168.21.8</TD> <TD ALIGN=LEFT>3.4 Xeon</TD> <TD ALIGN=LEFT>800MHz</TD> <TD ALIGN=LEFT>1MB</TD> <TD ALIGN=LEFT>1GB</TD> <TD ALIGN=LEFT>80GB SATA</TD> <TD ALIGN=LEFT>N/A</TD> <TD ALIGN=LEFT>2 x Gigabit</TD> </TR> <TR> <TD 
HEIGHT=16 ALIGN=RIGHT SDVAL="8" SDNUM="1033;">8</TD> <TD ALIGN=LEFT>Hcl06 (NIC2)</TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT>192.168.21.108</TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> </TR> <TR> <TD HEIGHT=22 ALIGN=RIGHT SDVAL="9" SDNUM="1033;">9</TD> <TD ALIGN=LEFT>Hcl07 (NIC1)</TD> <TD ALIGN=LEFT>Dell Poweredge 750</TD> <TD ALIGN=LEFT>FC4</TD> <TD ALIGN=LEFT>192.168.21.9</TD> <TD ALIGN=LEFT>3.4 Xeon</TD> <TD ALIGN=LEFT>800MHz</TD> <TD ALIGN=LEFT>1MB</TD> <TD ALIGN=LEFT>256MB</TD> <TD ALIGN=LEFT>80GB SATA</TD> <TD ALIGN=LEFT>N/A</TD> <TD ALIGN=LEFT>2 x Gigabit</TD> </TR> <TR> <TD HEIGHT=16 ALIGN=RIGHT SDVAL="9" SDNUM="1033;">9</TD> <TD ALIGN=LEFT>Hcl07 (NIC2)</TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT>192.168.21.109</TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> </TR> <TR> <TD HEIGHT=22 ALIGN=RIGHT SDVAL="10" SDNUM="1033;">10</TD> <TD ALIGN=LEFT>Hcl08 (NIC1)</TD> <TD ALIGN=LEFT>Dell Poweredge 750</TD> <TD ALIGN=LEFT>FC4</TD> <TD ALIGN=LEFT>192.168.21.10</TD> <TD ALIGN=LEFT>3.4 Xeon</TD> <TD ALIGN=LEFT>800MHz</TD> <TD ALIGN=LEFT>1MB</TD> <TD ALIGN=LEFT>256MB</TD> <TD ALIGN=LEFT>80GB SATA</TD> <TD ALIGN=LEFT>N/A</TD> <TD ALIGN=LEFT>2 x Gigabit</TD> </TR> <TR> <TD HEIGHT=16 ALIGN=RIGHT SDVAL="10" SDNUM="1033;">10</TD> <TD ALIGN=LEFT>Hcl08 (NIC2)</TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT>192.168.21.110</TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> </TR> <TR> <TD HEIGHT=22 ALIGN=RIGHT SDVAL="11" SDNUM="1033;">11</TD> <TD ALIGN=LEFT>Hcl09 (NIC1)</TD> <TD ALIGN=LEFT>IBM E-server 326</TD> <TD 
ALIGN=LEFT>Debian</TD> <TD ALIGN=LEFT>192.168.21.11</TD> <TD ALIGN=LEFT>1.8 AMD Opteron</TD> <TD ALIGN=LEFT>1GHz</TD> <TD ALIGN=LEFT>1MB</TD> <TD ALIGN=LEFT>1GB</TD> <TD ALIGN=LEFT>80GB SATA</TD> <TD ALIGN=LEFT>N/A</TD> <TD ALIGN=LEFT>2 x Gigabit</TD> </TR> <TR> <TD HEIGHT=16 ALIGN=RIGHT SDVAL="11" SDNUM="1033;">11</TD> <TD ALIGN=LEFT>Hcl09 (NIC2)</TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT>192.168.21.111</TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> </TR> <TR> <TD HEIGHT=22 ALIGN=RIGHT SDVAL="12" SDNUM="1033;">12</TD> <TD ALIGN=LEFT>Hcl10 (NIC1)</TD> <TD ALIGN=LEFT>IBM E-server 326</TD> <TD ALIGN=LEFT>FC4</TD> <TD ALIGN=LEFT>192.168.21.12</TD> <TD ALIGN=LEFT>1.8 AMD Opteron</TD> <TD ALIGN=LEFT>1GHz</TD> <TD ALIGN=LEFT>1MB</TD> <TD ALIGN=LEFT>1GB</TD> <TD ALIGN=LEFT>80GB SATA</TD> <TD ALIGN=LEFT>N/A</TD> <TD ALIGN=LEFT>2 x Gigabit</TD> </TR> <TR> <TD HEIGHT=16 ALIGN=RIGHT SDVAL="12" SDNUM="1033;">12</TD> <TD ALIGN=LEFT>Hcl10 (NIC2)</TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT>192.168.21.112</TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> </TR> <TR> <TD HEIGHT=22 ALIGN=RIGHT SDVAL="13" SDNUM="1033;">13</TD> <TD ALIGN=LEFT>Hcl11 (NIC1)</TD> <TD ALIGN=LEFT>IBM X-Series 306</TD> <TD ALIGN=LEFT>Debian</TD> <TD ALIGN=LEFT>192.168.21.13</TD> <TD ALIGN=LEFT>3.2 P4</TD> <TD ALIGN=LEFT>800MHz</TD> <TD ALIGN=LEFT>1MB</TD> <TD ALIGN=LEFT>512MB</TD> <TD ALIGN=LEFT>80GB SATA</TD> <TD ALIGN=LEFT>N/A</TD> <TD ALIGN=LEFT>2 x Gigabit</TD> </TR> <TR> <TD HEIGHT=16 ALIGN=RIGHT SDVAL="13" SDNUM="1033;">13</TD> <TD ALIGN=LEFT>Hcl11 (NIC2)</TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT>192.168.21.113</TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> 
<TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> </TR> <TR> <TD HEIGHT=22 ALIGN=RIGHT SDVAL="14" SDNUM="1033;">14</TD> <TD ALIGN=LEFT>Hcl12 (NIC1)</TD> <TD ALIGN=LEFT>HP Proliant DL 320 G3</TD> <TD ALIGN=LEFT>FC4</TD> <TD ALIGN=LEFT>192.168.21.14</TD> <TD ALIGN=LEFT>3.4 P4</TD> <TD ALIGN=LEFT>800MHz</TD> <TD ALIGN=LEFT>1MB</TD> <TD ALIGN=LEFT>512MB</TD> <TD ALIGN=LEFT>80GB SATA</TD> <TD ALIGN=LEFT>N/A</TD> <TD ALIGN=LEFT>2 x Gigabit</TD> </TR> <TR> <TD HEIGHT=16 ALIGN=RIGHT SDVAL="14" SDNUM="1033;">14</TD> <TD ALIGN=LEFT>Hcl12 (NIC2)</TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT>192.168.21.114</TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> </TR> <TR> <TD HEIGHT=22 ALIGN=RIGHT SDVAL="15" SDNUM="1033;">15</TD> <TD ALIGN=LEFT>Hcl13 (NIC1)</TD> <TD ALIGN=LEFT>HP Proliant DL 320 G3</TD> <TD ALIGN=LEFT>FC4</TD> <TD ALIGN=LEFT>192.168.21.15</TD> <TD ALIGN=LEFT>2.9 Celeron</TD> <TD ALIGN=LEFT>533MHz</TD> <TD ALIGN=LEFT>256KB</TD> <TD ALIGN=LEFT>1GB</TD> <TD ALIGN=LEFT>80GB SATA</TD> <TD ALIGN=LEFT>N/A</TD> <TD ALIGN=LEFT>2 x Gigabit</TD> </TR> <TR> <TD HEIGHT=16 ALIGN=RIGHT SDVAL="15" SDNUM="1033;">15</TD> <TD ALIGN=LEFT>Hcl13 (NIC2)</TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT>192.168.21.115</TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> </TR> <TR> <TD HEIGHT=22 ALIGN=RIGHT SDVAL="16" SDNUM="1033;">16</TD> <TD ALIGN=LEFT>Hcl14 (NIC1)</TD> <TD ALIGN=LEFT>HP Proliant DL 140 G2</TD> <TD ALIGN=LEFT>Debian</TD> <TD ALIGN=LEFT>192.168.21.16</TD> <TD ALIGN=LEFT>3.4 Xeon</TD> <TD ALIGN=LEFT>800MHz</TD> <TD ALIGN=LEFT>1MB</TD> <TD ALIGN=LEFT>1GB</TD> <TD ALIGN=LEFT>80GB SATA</TD> <TD 
ALIGN=LEFT>N/A</TD> <TD ALIGN=LEFT>2 x Gigabit</TD> </TR> <TR> <TD HEIGHT=16 ALIGN=RIGHT SDVAL="16" SDNUM="1033;">16</TD> <TD ALIGN=LEFT>Hcl14 (NIC2)</TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT>192.168.21.116</TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> </TR> <TR> <TD HEIGHT=22 ALIGN=RIGHT SDVAL="17" SDNUM="1033;">17</TD> <TD ALIGN=LEFT>Hcl15 (NIC1)</TD> <TD ALIGN=LEFT>HP Proliant DL 140 G2</TD> <TD ALIGN=LEFT>Debian</TD> <TD ALIGN=LEFT>192.168.21.17</TD> <TD ALIGN=LEFT>2.8 Xeon</TD> <TD ALIGN=LEFT>800MHz</TD> <TD ALIGN=LEFT>1MB</TD> <TD ALIGN=LEFT>1GB</TD> <TD ALIGN=LEFT>80GB SATA</TD> <TD ALIGN=LEFT>N/A</TD> <TD ALIGN=LEFT>2 x Gigabit</TD> </TR> <TR> <TD HEIGHT=16 ALIGN=RIGHT SDVAL="17" SDNUM="1033;">17</TD> <TD ALIGN=LEFT>Hcl15 (NIC2)</TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT>192.168.21.117</TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> </TR> <TR> <TD HEIGHT=22 ALIGN=RIGHT SDVAL="18" SDNUM="1033;">18</TD> <TD ALIGN=LEFT>Hcl16 (NIC1)</TD> <TD ALIGN=LEFT>HP Proliant DL 140 G2</TD> <TD ALIGN=LEFT>Debian</TD> <TD ALIGN=LEFT>192.168.21.18</TD> <TD ALIGN=LEFT>3.6 Xeon</TD> <TD ALIGN=LEFT>800MHz</TD> <TD ALIGN=LEFT>2MB</TD> <TD ALIGN=LEFT>1GB</TD> <TD ALIGN=LEFT>80GB SATA</TD> <TD ALIGN=LEFT>N/A</TD> <TD ALIGN=LEFT>2 x Gigabit</TD> </TR> <TR> <TD HEIGHT=16 ALIGN=RIGHT SDVAL="18" SDNUM="1033;">18</TD> <TD ALIGN=LEFT>Hcl16 (NIC2)</TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT>192.168.21.118</TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> <TD ALIGN=LEFT><BR></TD> </TR> </TABLE> 29c1d28ea777b0b83f2123dcf8b5131f814562a4 BLAS 
LAPACK ScaLAPACK 0 15 625 392 2011-07-25T13:26:03Z Davepc 2 moved [[BLAS/LAPACK]] to [[BLAS LAPACK ScaLAPACK]]:&#32;Adding ScaLAPACK documentation. wikitext text/x-wiki A de facto standard API for linear algebra [http://en.wikipedia.org/wiki/Basic_Linear_Algebra_Subprograms BLAS]/[http://en.wikipedia.org/wiki/LAPACK LAPACK] * Original http://www.netlib.org/blas/ http://www.netlib.org/lapack/ - implemented in Fortran. The libraries can be used in C/C++ (the so-called Fortran interface to BLAS/LAPACK). * ATLAS http://math-atlas.sourceforge.net/ - provides a C interface to BLAS and partially LAPACK. Binary packages: libatlas-[base or platform name, for example sse2] * MKL http://software.intel.com/en-us/intel-mkl/ - Intel's implementation. Using the C interface is preferable. [http://www.inf.bv.tum.de/~heisserer/softwarelab04/doc/blas_report.pdf BLAS: overview, installation, usage] f713e9c868a10a73b0cc30b9abb57dfab67a3ef3 628 625 2011-07-25T13:26:56Z Davepc 2 wikitext text/x-wiki A de facto standard API for linear algebra [http://en.wikipedia.org/wiki/Basic_Linear_Algebra_Subprograms BLAS]/[http://en.wikipedia.org/wiki/LAPACK LAPACK] * Original http://www.netlib.org/blas/ http://www.netlib.org/lapack/ - implemented in Fortran. The libraries can be used in C/C++ (the so-called Fortran interface to BLAS/LAPACK). * ATLAS http://math-atlas.sourceforge.net/ - provides a C interface to BLAS and partially LAPACK. Binary packages: libatlas-[base or platform name, for example sse2] * MKL http://software.intel.com/en-us/intel-mkl/ - Intel's implementation. Using the C interface is preferable. [http://www.inf.bv.tum.de/~heisserer/softwarelab04/doc/blas_report.pdf BLAS: overview, installation, usage] = ScaLAPACK = http://www.netlib.org/scalapack/ 59daa79d93224365f36a6216aebe320ba6aee42a BLAS/LAPACK 0 77 626 2011-07-25T13:26:03Z Davepc 2 moved [[BLAS/LAPACK]] to [[BLAS LAPACK ScaLAPACK]]:&#32;Adding ScaLAPACK documentation.
wikitext text/x-wiki #REDIRECT [[BLAS LAPACK ScaLAPACK]] fa01d21160767f6a1d3e4ab0033470058ff93f26 HCL cluster 0 5 636 547 2011-08-04T12:35:02Z Davepc 2 /* Cluster Administration */ wikitext text/x-wiki == General Information == [[Image:Cluster.jpg|right|thumbnail||HCL Cluster]] [[Image:network.jpg|right|thumbnail||Layout of the Cluster]] The HCL cluster is heterogeneous in both computing hardware and network capability. Nodes are from Dell, IBM, and HP, with Celeron, Pentium 4, Xeon, and AMD processors at clock speeds ranging from 1.8 to 3.6GHz. Accordingly, architectures and parameters such as front side bus, cache, and main memory all vary. The operating system used is Debian “squeeze” with Linux kernel 2.6.32. The network hardware consists of two Cisco 24+4 port Gigabit switches. Each node has two Gigabit ethernet ports - each eth0 is connected to the first switch, and each eth1 is connected to the second switch. The switches are also connected to each other. The bandwidth of each port can be configured to any value between 8Kb/s and 1Gb/s. This allows testing on a very large number of network topologies. Since the bandwidth of the link connecting the two switches can also be configured, the cluster can act as two separate clusters connected via one link. The diagram shows a schematic of the cluster.
=== Detailed Cluster Specification === * [[HCL Cluster Specifications]] * [[Old HCL Cluster Specifications]] (pre May 2010) === Documentation === * [[media:PE750.tgz|Dell Poweredge 750 Documentation]] * [[media:SC1425.tgz|Dell Poweredge SC1425 Documentation]] * [[media:X306.pdf|IBM x-Series 306 Documentation]] * [[media:E326.pdf|IBM e-Series 326 Documentation ]] * [[media:Proliant100SeriesGuide.pdf|HP Proliant DL-140 G2 Documentation]] * [[media:ProliantDL320G3Guide.pdf|HP Proliant DL-320 G3 Documentation]] * [[media:Cisco3560Specs.pdf|Cisco Catalyst 3560 Specifications]] * [[media:Cisco3560Guide.pdf|Cisco Catalyst 3560 User Guide]] * [[HCL Cluster Network]] == Cluster Administration == If PBS jobs do not start after a reboot of heterogeneous.ucd.ie it may be necessary to manually start maui: /usr/local/maui/sbin/maui ===Useful Tools=== <code>root</code> on <code>heterogeneous.ucd.ie</code> has a number of [http://expect.nist.gov/ Expect] scripts to automate administration on the cluster (in <code>/root/scripts</code>). <code>root_ssh</code> will automatically log into a host, provide the root password and either return a shell to the user or execute a command that is passed as a second argument. Command syntax is as follows: <source lang="text"> # root_ssh usage: root_ssh [user@]<host> [command] </source> Example usage, to login and execute a command on each node in the cluster (note the file <code>/etc/dsh/machines.list</code> contains the hostnames of all compute nodes of the cluster): # for i in `seq -w 1 16`; do root_ssh hcl$i ps ax \| grep pbs; done The above is sequential. To run parallel jobs, for example: <code>apt-get update && apt-get -y upgrade</code>, try the following trick with [http://www.gnu.org/software/screen/ screen]: # for i in `seq -w 1 16`; do screen -L -d -m root_ssh hcl$i apt-get update \&\& apt-get -y upgrade; done You can check the screenlog.* files for errors and delete them when you are happy. 
Sometimes all logs are sent to screenlog.0; it is not clear why. == Software packages available on HCL Cluster 2.0 == With a fresh installation of the operating system on the HCL Cluster, the following packages are available: * autoconf * automake * gcc * ctags * cg-vg * fftw2 * git * gfortran * gnuplot * libtool * netperf * octave3.2 * qhull * subversion * valgrind * gsl-dev * vim * python * mc * openmpi-bin * openmpi-dev * evince * libboost-graph-dev * libboost-serialization-dev * r-cran-strucchange * graphviz * doxygen * colorgcc [[HCL_cluster/hcl_node_install_configuration_log|new hcl node install & configuration log]] [[HCL_cluster/heterogeneous.ucd.ie_install_log|new heterogeneous.ucd.ie install log]] ===APT=== To do unattended updates on cluster machines you need to specify some environment variables and switches to apt-get: export DEBIAN_FRONTEND=noninteractive apt-get -q -y upgrade NOTE: on hcl01 and hcl02 any updates to grub will force a prompt, despite the switches above. This happens because there are two disks on these machines and grub asks which it should install itself on. == Access and Security == All access and security for the cluster is handled by the gateway machine (heterogeneous.ucd.ie). This machine is not considered a compute node and should not be used as such. The only new incoming connections allowed are ssh; other incoming packets, such as http responses to requests made from inside the cluster (established or related), are also allowed. Incoming ssh packets are only accepted if they originate from designated IP addresses. These IPs must be registered UCD IPs. csserver.ucd.ie is allowed, as is hclgate.ucd.ie, on which all users have accounts. Thus, to gain access to the cluster, you can ssh from csserver, hclgate or another allowed machine to heterogeneous. From there you can ssh to any of the nodes (hcl01-hcl16) on which you are running a PBS job.
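As an aside, the node hostnames are zero-padded (hcl01, not hcl1), which is why the administration loops earlier on this page use <code>seq -w</code> rather than plain <code>seq</code>. A quick sketch (assumes GNU coreutils <code>seq</code>; the names are only printed, no host is contacted):

<source lang="bash">
# seq -w pads every number with leading zeros to the width of the largest,
# so the generated names match the hostnames hcl01 ... hcl16
seq -w 1 16 | sed 's/^/hcl/' | head -n 3
# prints:
# hcl01
# hcl02
# hcl03
</source>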
Access from outside the UCD network is only possible once you have gained entry to a server that allows outside connections (such as csserver.ucd.ie). === Creating new user accounts === As root on heterogeneous run: adduser <username> make -C /var/yp === Access to the nodes is controlled by Torque PBS === Use qsub to submit a job; -I requests an interactive session, and walltime is the time required. qsub -I -l walltime=1:00 \\ Reserve 1 node for 1 hour qsub -l nodes=hcl01+hcl07,walltime=1:00 myscript.sh Example script: #!/bin/sh #General Script # # #These commands set up the Grid Environment for your job: #PBS -N JOBNAME #PBS -l walltime=48:00:00 #PBS -l nodes=16 #PBS -m abe #PBS -k eo #PBS -V echo foo To see the queue qstat -n showq To remove your job qdel JOBNUM More info: [http://www.clusterresources.com/products/torque/docs/] == Some networking issues on HCL cluster (unsolved) == "/sbin/route" should give: Kernel IP routing table Destination Gateway Genmask Flags Metric Ref Use Iface 239.2.11.72 * 255.255.255.255 UH 0 0 0 eth0 heterogeneous.u * 255.255.255.255 UH 0 0 0 eth0 192.168.21.0 * 255.255.255.0 U 0 0 0 eth1 192.168.20.0 * 255.255.255.0 U 0 0 0 eth0 192.168.20.0 * 255.255.254.0 U 0 0 0 eth0 192.168.20.0 * 255.255.254.0 U 0 0 0 eth1 default heterogeneous.u 0.0.0.0 UG 0 0 0 eth0 For reasons unclear, many machines sometimes miss the entry: 192.168.21.0 * 255.255.255.0 U 0 0 0 eth1 For Open MPI, this makes the system sockets "connect" call to any 192.*.21.* address hang. In this case, you can * switch off eth1 (see also [http://hcl.ucd.ie/wiki/index.php/OpenMPI] ): mpirun --mca btl_tcp_if_exclude lo,eth1 ... or * restore the above table on all nodes by running "sh /etc/network/if-up.d/00routes" as root It is not yet clear why, without this entry, connections to the "21" addresses fail.
We expect that in this case the following rule should be matched (because of the mask): 192.168.20.0 * 255.255.254.0 U 0 0 0 eth0 The packets then leave over the eth0 network interface and should go over switch1 to switch2 and on to the eth1 interface of the corresponding node. * If one attempts a ping from one node A, via its eth0 interface, to the address of another node's (B) eth1 interface, the following is observed: ** outgoing ping packets appear only on the eth0 interface of the first node A. ** incoming ping packets appear only on the eth1 interface of the second node B. ** outgoing ping response packets appear on the eth0 interface of the second node B, never on the eth1 interface, despite pinging the eth1 address specifically. What explains this? With the routing tables as they are above, or in the damaged case, the ping may arrive at the correct interface, but the response from B is routed to A-eth0 via B-eth0. Further, after a number of ping packets have been sent in sequence (50 to 100), pings from A, though the -i eth0 switch is specified, begin to appear on both A-eth0 and A-eth1. This behaviour is unexpected, but does not affect the return path of the ping response packet. In order to get symmetric behaviour, where a packet leaves A-eth0, travels via the switch bridge to B-eth1 and returns from B-eth1 to A-eth0, one must ensure the routing table of B contains no eth0 entries. == Paging and the OOM-Killer == Due to the nature of experiments run on the cluster, we often induce heavy paging and complete exhaustion of available memory on certain nodes. Linux has a pair of strategies to deal with heavy memory use. The first is overcommitting: a process is allowed to allocate or fork even when there is no more memory available. You can see some interesting numbers here: [http://opsmonkey.blogspot.com/2007/01/linux-memory-overcommit.html].
The assumption is that processes may not use all the memory they allocate, and that failing on allocation is worse than failing later, when the memory is actually used. More processes may be supported by allowing them to allocate memory (provided they do not use it all). The second part of the strategy is the Out-of-Memory killer (OOM Killer). When memory has been exhausted and a process tries to use some 'overcommitted' part of memory, the OOM killer is invoked. Its job is to rank all processes in terms of their memory use, priority, privilege and some other parameters, and then select a process to kill based on the ranks. The argument for using overcommit plus the OOM Killer is that rather than failing to allocate memory for some random unlucky process, which as a result would probably terminate, the kernel can instead allow the unlucky process to continue executing and then make a somewhat informed decision on which process to kill. Unfortunately, the behaviour of the OOM-killer sometimes causes problems that grind the machine to a complete halt, particularly when it decides to kill system processes. There is a good discussion of the OOM-killer here: [http://lwn.net/Articles/104179/] For this reason overcommit has been disabled on the cluster.
cat /proc/sys/vm/overcommit_memory 2 cat /proc/sys/vm/overcommit_ratio 100 To restore the default overcommit settings: # echo 0 > /proc/sys/vm/overcommit_memory # echo 50 > /proc/sys/vm/overcommit_ratio == Manually Limit the Memory on the OS level == As root, edit /etc/default/grub GRUB_CMDLINE_LINUX_DEFAULT="quiet mem=128M" then run update-grub and reboot 7fa928860e50e1036d76fd58e9197c40f4c92086 NLOPT 0 78 638 2011-09-29T18:54:39Z Davepc 2 Created page with "[http://ab-initio.mit.edu/wiki/index.php/NLopt Download nlopt from here] Unpack and configure with: ./configure --with-cxx --enable-shared make make install" wikitext text/x-wiki [http://ab-initio.mit.edu/wiki/index.php/NLopt Download nlopt from here] Unpack and configure with: ./configure --with-cxx --enable-shared make make install 954ec13b8c84c6e94302325e1d1a1f5b5f7000f7 MPI 0 29 639 375 2011-11-17T20:53:59Z Davepc 2 wikitext text/x-wiki == Documentation == * http://www.mpi-forum.org/docs/docs.html == Implementations == * [[LAM]] * [[MPICH]] * [[OpenMPI]] * [[MPICH2]] == Manual installation == Install each implementation in a separate subfolder <code>$HOME/SUBDIR</code>, because you may need several MPI implementations (see [[Libraries]]) == Tips & Tricks == * For safe consecutive communications, create a new context, for example: <source lang="C"> int communication_operation(MPI_Comm comm) { MPI_Comm newcomm; MPI_Comm_dup(comm, &newcomm); ... // work with newcomm MPI_Comm_free(&newcomm); } </source> Mind the overhead of <code>MPI_Comm_dup</code> and <code>MPI_Comm_free</code>.
* If you are having trouble with the multi-homed nature of the HCL Cluster, check [http://www.open-mpi.org/faq/?category=tcp#tcp-selection here] == Debugging == * Add the following code: <source lang="C"> int rank; MPI_Comm_rank(MPI_COMM_WORLD, &rank); if (!rank) getc(stdin); MPI_Barrier(MPI_COMM_WORLD); </source> * Compile your code with the <code>-g</code> option * Run the parallel application * Attach to process(es) from [[GDB]] ** MPICH-1 runs a background process for each application process: 0, 0b, 1, 1b, ..., therefore, attach to the first ones. == Profiling == [http://www.bsc.es/plantillaA.php?cat_id=488 Paraver] by Barcelona Supercomputing Center is "a flexible performance visualization and analysis tool" 023ddb078b33d8276c5edf48316710cde6d73c19 Grid5000 0 6 640 635 2011-11-28T17:12:58Z Davepc 2 /* Login, job submission, deployment of image */ wikitext text/x-wiki https://www.grid5000.fr/mediawiki/index.php/Grid5000:Home [https://www.grid5000.fr/mediawiki/index.php/Grid5000:UserCharter USAGE POLICY] == Login, job submission, deployment of image == *Select sites and clusters for experiments, using information on the [https://www.grid5000.fr/mediawiki/index.php/Grid5000:Network#Grid.275000_Sites Grid5000 network] and the [https://www.grid5000.fr/mediawiki/index.php/Status Status page] *Access is provided via access nodes '''access.SITE.grid5000.fr''' marked [https://www.grid5000.fr/mediawiki/index.php/External_access here] as ''accessible from '''everywhere''' via ssh with '''keyboard-interactive''' authentication method''. As soon as you are on one of the sites, you can ssh directly to the frontend node of any other site: <source lang="bash"> access_$ ssh frontend.SITE2 </source> *There is no access to the Internet from computing nodes (external IPs must be registered on the proxy), therefore, download/update your stuff at the access nodes. Several revision control clients are available.
*Each site has a separate NFS, therefore, to run an application on several sites at once, you need to copy it '''scp, sftp, rsync''' between access or frontend nodes. *Jobs are run from the frondend nodes, using a [http://en.wikipedia.org/wiki/OpenPBS PBS]-like system [https://www.grid5000.fr/mediawiki/index.php/Cluster_experiment-OAR2 OAR]. Basic commands: **'''oarstat''' - queue status **'''oarsub''' - job submission **'''oardel''' - job removal Interactive job on deployed images: <source lang="bash"> fontend_$ oarsub -I -t deploy -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> Batch job on installed images: <source lang="bash"> fontend_$ oarsub BATCH_FILE -t allow_classic_ssh -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> *The image to deploy can be created and loaded with help of a [http://wiki.systemimager.org/index.php/Main_Page Systemimager]-like system [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2 Kadeploy]. Creating: [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2#Tune_an_environment_to_build_another_one:_customize_authentification_parameters described here] Loading: <source lang="bash"> fontend_$ kadeploy3 -a PATH_TO_PRIVATE_IMAGE_DESC -f $OAR_FILE_NODES </source> A Linux distribution lenny-x64-nfs-2.1 with mc, subversion, autotools, doxygen, MPICH2, GSL, Boost, R, gnuplot, graphviz, X11, evince is available at Orsay /home/nancy/alastovetsky/grid5000. == Compiling and running MPI applications == *Compilation should be done on one of the reserved nodes (e.g. ssh `head -n 1 $OAR_NODEFILE`) *Running MPI applications is described [https://www.grid5000.fr/mediawiki/index.php/Run_MPI_On_Grid%275000 here] **mpirun/mpiexec should be run from one of the reserved nodes (e.g. 
ssh `head -n 1 $OAR_NODEFILE`) == Setting up new deploy image == oarsub -I -t deploy -l nodes=1,walltime=12 kadeploy3 -e lenny-x64-nfs -f $OAR_FILE_NODES -k ssh root@`head -n 1 $OAR_NODEFILE` default password: grid5000 edit /etc/apt/sources.list apt-get update apt-get upgrade apt-get install libtool autoconf automake mc colorgcc ctags libboost-serialization-dev libboost-graph-dev libatlas-base-dev gfortran vim gdb valgrind screen subversion iperf bc <br> Compiled from sources by us: *gsl-1.14 (download: ftp://ftp.gnu.org/gnu/gsl/) ./configure &amp;&amp; make &amp;&amp; make install *mpich2 (download: http://www.mcs.anl.gov/research/projects/mpich2/downloads/index.php?s=downloads) ./configure --enable-shared --enable-sharedlibs=gcc --with-pm=mpd make &amp;&amp; make install MPICH2 installed to: Installing MPE2 include files to /usr/local/include Installing MPE2 libraries to /usr/local/lib Installing MPE2 utility programs to /usr/local/bin Installing MPE2 configuration files to /usr/local/etc Installing MPE2 system utility programs to /usr/local/sbin Installing MPE2 man to /usr/local/share/man Installing MPE2 html to /usr/local/share/doc/ Installed MPE2 in /usr/local *hwloc (and lstopo) (download: http://www.open-mpi.org/software/hwloc/v1.2/) compile from sources. To get xml support install libxml2-dev and pkg-config apt-get install libxml2-dev pkg-config tar -xzvf hwloc-1.1.1.tar.gz cd hwloc-1.1.1 ./configure &amp;&amp; make &amp;&amp; make install Change root password. rm sources from root dir. Edit the "message of the day" vi /etc/motd.tail echo 90 > /proc/sys/vm/overcommit_ratio echo 2 > /proc/sys/vm/overcommit_memory date >> release Cleanup apt-get clean rm /etc/udev/rules.d/*-persistent-net.rules Make image ssh root@'''node''' tgz-g5k &gt; $HOME/grid5000/'''imagename'''.tgz Make an appropriate .env file.
kaenv3 -p lenny-x64-nfs -u deploy &gt; lenny-x64-custom-2.3.env <br> == GotoBLAS2 == When compiling GotoBLAS on a node without direct internet access, you get this error: <source lang="">wget http://www.netlib.org/lapack/lapack-3.1.1.tgz --2011-05-19 03:11:03-- http://www.netlib.org/lapack/lapack-3.1.1.tgz Resolving www.netlib.org... 160.36.58.108 Connecting to www.netlib.org|160.36.58.108|:80... failed: Connection timed out. Retrying. --2011-05-19 03:14:13-- (try: 2) http://www.netlib.org/lapack/lapack-3.1.1.tgz Connecting to www.netlib.org|160.36.58.108|:80... failed: Connection timed out. Retrying. ...</source> Fix by downloading http://www.netlib.org/lapack/lapack-3.1.1.tgz to the GotoBLAS2 source directory and editing this line in the Makefile 184c184 &lt; -wget http://www.netlib.org/lapack/lapack-3.1.1.tgz --- &gt; # -wget http://www.netlib.org/lapack/lapack-3.1.1.tgz <br> GotoBLAS needs to be compiled individually for each unique machine, i.e. each cluster. Add the following to .bashrc export CLUSTER=`hostname |sed 's/\([a-z]*\).*/\1/'` LD_LIBRARY_PATH=$HOME/lib/$CLUSTER:$HOME/lib:/usr/local/lib:$LD_LIBRARY_PATH export LIBRARY_PATH=$HOME/lib/$CLUSTER:$HOME/lib:/usr/local/lib:$LIBRARY_PATH Run the following script once on each cluster: <source lang="bash">#! /bin/bash echo "Compiling gotoblas for cluster: $CLUSTER" cd $HOME/src if [ ! -d "$CLUSTER" ]; then mkdir $CLUSTER fi cd $CLUSTER tar -xzf ../Goto*.tar.gz cd Goto* make &> m.log if [ ! -d "$HOME/lib/$CLUSTER" ]; then mkdir $HOME/lib/$CLUSTER fi cp libgoto2.so $HOME/lib/$CLUSTER echo results ls -d $HOME/src/$CLUSTER ls $HOME/src/$CLUSTER ls -d $HOME/lib/$CLUSTER ls $HOME/lib/$CLUSTER</source> Note: for newer processors this may fail. If it is a NEHALEM processor try: make clean make TARGET=NEHALEM == Paging and the OOM-Killer == When running experiments that exhaust available memory, problems can occur with overcommit. See [[HCL_cluster#Paging_and_the_OOM-Killer]] for more detail.
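The <code>$CLUSTER</code> variable in the .bashrc snippet of the GotoBLAS2 section above simply keeps the leading run of lowercase letters of the hostname, which on Grid5000 is the cluster name. A quick check of the sed expression (the hostnames here are only examples):

<source lang="bash">
# sed keeps the leading [a-z]* run and drops the rest of the hostname
echo "griffon-42.nancy.grid5000.fr" | sed 's/\([a-z]*\).*/\1/'
# prints: griffon
echo "hcl05" | sed 's/\([a-z]*\).*/\1/'
# prints: hcl
</source>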
== Example of experiment setup across several sites == The sources of all files mentioned below are available at: [[Grid5000:sources]]. Pick one head node as the main head node (I use grenoble, but any will do). Set up sources cd dave/fupermod-1.1.0 make clean ./configure --with-cblas=goto --prefix=/usr/local/ Reserve 2 nodes from all clusters on a three-cluster site: oarsub -r "2011-07-25 11:01:01" -t deploy -l cluster=3/nodes=2,walltime=11:59:00 Automate with: for a in 2 3 4; do for i in `cat sites.$a`; do echo $a $i; ssh $i oarsub -r "2011-07-25 11:01:01" -t deploy -l cluster=$a/nodes=2,walltime=11:59:00; done; done Then on each site, where xxx is the site name: kadeploy3 -a $HOME/grid5000/lenny-dave.env -f $OAR_NODE_FILE --output-ok-nodes deployed.xxx Gather deployed files to a head node: for i in `cat ~/sites `; do echo $i; scp $i:deployed* .&nbsp;; done cat deployed.* &gt; deployed.all Copy cluster-specific libs to each deployed node /usr/local/lib dir with script copy_local_libs.sh deployed.all Copy source files to root dir of each deployed node.
Then make install each (note: ssh -f does this in parallel) for i in `cat ~/deployed.all`; do echo $i; rsync -aP ~/dave/fupermod-1.1.0 root@$i:&nbsp;; done for i in `cat ~/deployed.all`; do echo $i; ssh -f root@$i "cd fupermod-1.1.0&nbsp;; make all install"&nbsp;; done ssh to the first node ssh `head -n1 deployed.all` n=$(cat deployed.all |wc -l) mpdboot --totalnum=$n --file=$HOME/deployed.all mpdtrace cd dave/data/ mpirun -n $n /usr/local/bin/partitioner -l /usr/local/lib/libmxm_col.so -a0 -D10000 -o N=100 Cleanup after: for i in `cat ~/sites `; do echo $i; ssh $i rm deployed.* ; done == Check network speed == apt-get install iperf 68df42ff509883fa48112d2d2acf2c96c2fca13e 641 640 2011-11-28T17:32:29Z Davepc 2 /* Setting up new deploy image */ wikitext text/x-wiki https://www.grid5000.fr/mediawiki/index.php/Grid5000:Home [https://www.grid5000.fr/mediawiki/index.php/Grid5000:UserCharter USAGE POLICY] == Login, job submission, deployment of image == *Select sites and clusters for experiments, using information on the [https://www.grid5000.fr/mediawiki/index.php/Grid5000:Network#Grid.275000_Sites Grid5000 network] and the [https://www.grid5000.fr/mediawiki/index.php/Status Status page] *Access is provided via access nodes '''access.SITE.grid5000.fr''' marked [https://www.grid5000.fr/mediawiki/index.php/External_access here] as ''accessible from '''everywhere''' via ssh with '''keyboard-interactive''' authentication method''. As soon as you are on one of the sites, you can ssh directly to the frontend node of any other site: <source lang="bash"> access_$ ssh frontend.SITE2 </source> *There is no access to the Internet from computing nodes (external IPs must be registered on the proxy), therefore, download/update your stuff at the access nodes. Several revision control clients are available. *Each site has a separate NFS, therefore, to run an application on several sites at once, you need to copy it ('''scp, sftp, rsync''') between access or frontend nodes.
*Jobs are run from the frontend nodes, using a [http://en.wikipedia.org/wiki/OpenPBS PBS]-like system [https://www.grid5000.fr/mediawiki/index.php/Cluster_experiment-OAR2 OAR]. Basic commands: **'''oarstat''' - queue status **'''oarsub''' - job submission **'''oardel''' - job removal Interactive job on deployed images: <source lang="bash"> frontend_$ oarsub -I -t deploy -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> Batch job on installed images: <source lang="bash"> frontend_$ oarsub BATCH_FILE -t allow_classic_ssh -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> *The image to deploy can be created and loaded with the help of a [http://wiki.systemimager.org/index.php/Main_Page Systemimager]-like system [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2 Kadeploy]. Creating: [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2#Tune_an_environment_to_build_another_one:_customize_authentification_parameters described here] Loading: <source lang="bash"> frontend_$ kadeploy3 -a PATH_TO_PRIVATE_IMAGE_DESC -f $OAR_FILE_NODES </source> A Linux distribution lenny-x64-nfs-2.1 with mc, subversion, autotools, doxygen, MPICH2, GSL, Boost, R, gnuplot, graphviz, X11, evince is available at Orsay /home/nancy/alastovetsky/grid5000. == Compiling and running MPI applications == *Compilation should be done on one of the reserved nodes (e.g. ssh `head -n 1 $OAR_NODEFILE`) *Running MPI applications is described [https://www.grid5000.fr/mediawiki/index.php/Run_MPI_On_Grid%275000 here] **mpirun/mpiexec should be run from one of the reserved nodes (e.g.
ssh `head -n 1 $OAR_NODEFILE`) == Setting up new deploy image == oarsub -I -t deploy -l nodes=1,walltime=12 kadeploy3 -e lenny-x64-nfs -f $OAR_FILE_NODES -k ssh root@`head -n 1 $OAR_NODEFILE` default password: grid5000 edit /etc/apt/sources.list apt-get update apt-get upgrade apt-get install libtool autoconf automake mc colorgcc ctags libboost-serialization-dev libboost-graph-dev libatlas-base-dev gfortran vim gdb valgrind screen subversion iperf bc gsl-bin <br> Compiled from sources by us: *<strike>gsl-1.14 (download: ftp://ftp.gnu.org/gnu/gsl/)</strike> <strike>./configure &amp;&amp; make &amp;&amp; make install</strike> *mpich2 (download: http://www.mcs.anl.gov/research/projects/mpich2/downloads/index.php?s=downloads) ./configure --enable-shared --enable-sharedlibs=gcc --with-pm=mpd make &amp;&amp; make install MPICH2 installed to: Installing MPE2 include files to /usr/local/include Installing MPE2 libraries to /usr/local/lib Installing MPE2 utility programs to /usr/local/bin Installing MPE2 configuration files to /usr/local/etc Installing MPE2 system utility programs to /usr/local/sbin Installing MPE2 man to /usr/local/share/man Installing MPE2 html to /usr/local/share/doc/ Installed MPE2 in /usr/local *hwloc (and lstopo) (download: http://www.open-mpi.org/software/hwloc/v1.2/) compile from sources. To get xml support install libxml2-dev and pkg-config apt-get install libxml2-dev pkg-config tar -xzvf hwloc-1.1.1.tar.gz cd hwloc-1.1.1 ./configure &amp;&amp; make &amp;&amp; make install Change root password. rm sources from root dir. Edit the "message of the day" vi /etc/motd.tail echo 90 &gt; /proc/sys/vm/overcommit_ratio echo 2 &gt; /proc/sys/vm/overcommit_memory date &gt;&gt; release Cleanup apt-get clean rm /etc/udev/rules.d/*-persistent-net.rules Make image ssh root@'''node''' tgz-g5k &gt; $HOME/grid5000/'''imagename'''.tgz Make an appropriate .env file.
kaenv3 -p lenny-x64-nfs -u deploy &gt; lenny-x64-custom-2.3.env <br> == GotoBLAS2 == When compiling GotoBLAS on a node without direct internet access, you get this error: <source lang="">wget http://www.netlib.org/lapack/lapack-3.1.1.tgz --2011-05-19 03:11:03-- http://www.netlib.org/lapack/lapack-3.1.1.tgz Resolving www.netlib.org... 160.36.58.108 Connecting to www.netlib.org|160.36.58.108|:80... failed: Connection timed out. Retrying. --2011-05-19 03:14:13-- (try: 2) http://www.netlib.org/lapack/lapack-3.1.1.tgz Connecting to www.netlib.org|160.36.58.108|:80... failed: Connection timed out. Retrying. ...</source> Fix by downloading http://www.netlib.org/lapack/lapack-3.1.1.tgz to the GotoBLAS2 source directory and editing this line in the Makefile 184c184 &lt; -wget http://www.netlib.org/lapack/lapack-3.1.1.tgz --- &gt; # -wget http://www.netlib.org/lapack/lapack-3.1.1.tgz <br> GotoBLAS needs to be compiled individually for each unique machine, i.e. each cluster. Add the following to .bashrc export CLUSTER=`hostname |sed 's/\([a-z]*\).*/\1/'` LD_LIBRARY_PATH=$HOME/lib/$CLUSTER:$HOME/lib:/usr/local/lib:$LD_LIBRARY_PATH export LIBRARY_PATH=$HOME/lib/$CLUSTER:$HOME/lib:/usr/local/lib:$LIBRARY_PATH Run the following script once on each cluster: <source lang="bash">#! /bin/bash echo "Compiling gotoblas for cluster: $CLUSTER" cd $HOME/src if [ ! -d "$CLUSTER" ]; then mkdir $CLUSTER fi cd $CLUSTER tar -xzf ../Goto*.tar.gz cd Goto* make &> m.log if [ ! -d "$HOME/lib/$CLUSTER" ]; then mkdir $HOME/lib/$CLUSTER fi cp libgoto2.so $HOME/lib/$CLUSTER echo results ls -d $HOME/src/$CLUSTER ls $HOME/src/$CLUSTER ls -d $HOME/lib/$CLUSTER ls $HOME/lib/$CLUSTER</source> Note: for newer processors this may fail. If it is a NEHALEM processor try: make clean make TARGET=NEHALEM == Paging and the OOM-Killer == When running experiments that exhaust available memory, problems can occur with overcommit. See [[HCL_cluster#Paging_and_the_OOM-Killer]] for more detail.
== Example of experiment setup across several sites == The sources of all files mentioned below are available at: [[Grid5000:sources]]. Pick one head node as the main head node (I use grenoble, but any will do). Set up sources cd dave/fupermod-1.1.0 make clean ./configure --with-cblas=goto --prefix=/usr/local/ Reserve 2 nodes from all clusters on a three-cluster site: oarsub -r "2011-07-25 11:01:01" -t deploy -l cluster=3/nodes=2,walltime=11:59:00 Automate with: for a in 2 3 4; do for i in `cat sites.$a`; do echo $a $i; ssh $i oarsub -r "2011-07-25 11:01:01" -t deploy -l cluster=$a/nodes=2,walltime=11:59:00; done; done Then on each site, where xxx is the site name: kadeploy3 -a $HOME/grid5000/lenny-dave.env -f $OAR_NODE_FILE --output-ok-nodes deployed.xxx Gather deployed files to a head node: for i in `cat ~/sites `; do echo $i; scp $i:deployed* .&nbsp;; done cat deployed.* &gt; deployed.all Copy cluster-specific libs to each deployed node /usr/local/lib dir with script copy_local_libs.sh deployed.all Copy source files to root dir of each deployed node.
Then make install each (node ssh -f does this in parallel) for i in `cat ~/deployed.all`; do echo $i; rsync -aP ~/dave/fupermod-1.1.0 root@$i:&nbsp;; done for i in `cat ~/deployed.all`; do echo $i; ssh -f root@$i "cd fupermod-1.1.0&nbsp;; make all install"&nbsp;; done ssh to the first node ssh `head -n1 deployed.all` n=$(cat deployed.all |wc -l) mpdboot --totalnum=$n --file=$HOME/deployed.all mpdtrace cd dave/data/ mpirun -n $n /usr/local/bin/partitioner -l /usr/local/lib/libmxm_col.so -a0 -D10000 -o N=100 Cleanup after: for i in `cat ~/sites `; do echo $i; ssh $i rm deployed.* ; done == Check network speed == apt-get install iperf 0c406e18ebb179c468af9d7d75f1732d8330c66e 642 641 2011-11-28T17:33:25Z Davepc 2 wikitext text/x-wiki https://www.grid5000.fr/mediawiki/index.php/Grid5000:Home [https://www.grid5000.fr/mediawiki/index.php/Grid5000:UserCharter USAGE POLICY] == Login, job submission, deployment of image == *Select sites and clusters for experiments, using information on the [https://www.grid5000.fr/mediawiki/index.php/Grid5000:Network#Grid.275000_Sites Grid5000 network] and the [https://www.grid5000.fr/mediawiki/index.php/Status Status page] *Access is provided via access nodes '''access.SITE.grid5000.fr''' marked [https://www.grid5000.fr/mediawiki/index.php/External_access here] as ''accessible from '''everywhere''' via ssh with '''keyboard-interactive''' authentication method''. As soon as you are on one of the sites, you can directly ssh frontend node of any other site: <source lang="bash"> access_$ ssh frontend.SITE2 </source> *There is no access to Internet from computing nodes (external IPs should be registered on proxy), therefore, download/update your stuff at the access nodes. Several revision control clients are available. *Each site has a separate NFS, therefore, to run an application on several sites at once, you need to copy it '''scp, sftp, rsync''' between access or frontend nodes. 
*Jobs are run from the frondend nodes, using a [http://en.wikipedia.org/wiki/OpenPBS PBS]-like system [https://www.grid5000.fr/mediawiki/index.php/Cluster_experiment-OAR2 OAR]. Basic commands: **'''oarstat''' - queue status **'''oarsub''' - job submission **'''oardel''' - job removal Interactive job on deployed images: <source lang="bash"> fontend_$ oarsub -I -t deploy -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> Batch job on installed images: <source lang="bash"> fontend_$ oarsub BATCH_FILE -t allow_classic_ssh -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> *The image to deploy can be created and loaded with help of a [http://wiki.systemimager.org/index.php/Main_Page Systemimager]-like system [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2 Kadeploy]. Creating: [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2#Tune_an_environment_to_build_another_one:_customize_authentification_parameters described here] Loading: <source lang="bash"> fontend_$ kadeploy3 -a PATH_TO_PRIVATE_IMAGE_DESC -f $OAR_FILE_NODES </source> A Linux distribution lenny-x64-nfs-2.1 with mc, subversion, autotools, doxygen, MPICH2, GSL, Boost, R, gnuplot, graphviz, X11, evince is available at Orsay /home/nancy/alastovetsky/grid5000. == Compiling and running MPI applications == *Compilation should be done on one of the reserved nodes (e.g. ssh `head -n 1 $OAR_NODEFILE`) *Running MPI applications is described [https://www.grid5000.fr/mediawiki/index.php/Run_MPI_On_Grid%275000 here] **mpirun/mpiexec should be run from one of the reserved nodes (e.g. 
ssh `head -n 1 $OAR_NODEFILE`) == Setting up new deploy image == oarsub -I -t deploy -l nodes=1,walltime=12 kadeploy3 -e lenny-x64-nfs -f $OAR_FILE_NODES -k ssh root@`head -n 1 $OAR_NODEFILE` default password: grid5000 edit /etc/apt/sources.list apt-get update apt-get upgrade apt-get install libtool autoconf automake mc colorgcc ctags libboost-serialization-dev libboost-graph-dev libatlas-base-dev gfortran vim gdb valgrind screen subversion iperf bc gsl-bin <br> Compiled for sources by us: *<strike>gsl-1.14 (download: ftp://ftp.gnu.org/gnu/gsl/)&nbsp;</strike> ''Now with squeeze it is in repository.'' <strike>./configure &amp;&amp; make &amp;&amp; make install</strike> *mpich2 (download: http://www.mcs.anl.gov/research/projects/mpich2/downloads/index.php?s=downloads) ./configure --enable-shared --enable-sharedlibs=gcc --with-pm=mpd make &amp;&amp; make install Mpich2 installed to: Installing MPE2 include files to /usr/local/include Installing MPE2 libraries to /usr/local/lib Installing MPE2 utility programs to /usr/local/bin Installing MPE2 configuration files to /usr/local/etc Installing MPE2 system utility programs to /usr/local/sbin Installing MPE2 man to /usr/local/share/man Installing MPE2 html to /usr/local/share/doc/ Installed MPE2 in /usr/local *hwloc (and lstopo) (download: http://www.open-mpi.org/software/hwloc/v1.2/) compile from sources. To get xml support install libxml2-dev and pkg-config apt-get install libxml2-dev pkg-config tar -xzvf hwloc-1.1.1.tar.gz cd hwloc-1.1.1 ./configure &amp;&amp; make &amp;&amp; make install Change root password. rm sources from root dir. Edit the "message of the day" vi /etc/motd.tail echo 90 &gt; /proc/sys/vm/overcommit_ratio echo 2 &gt; /proc/sys/vm/overcommit_memory date &gt;&gt; release Cleanup apt-get clean rm /etc/udev/rules.d/*-persistent-net.rules Make image ssh root@'''node''' tgz-g5k &gt; $HOME/grid5000/'''imagename'''.tgz make appropriate .env file. 
kaenv3 -p lenny-x64-nfs -u deploy &gt; lenny-x64-custom-2.3.env <br> == GotoBLAS2 == When compiling gotoblas on a node without direct internet access get this error: <source lang="">wget http://www.netlib.org/lapack/lapack-3.1.1.tgz --2011-05-19 03:11:03-- http://www.netlib.org/lapack/lapack-3.1.1.tgz Resolving www.netlib.org... 160.36.58.108 Connecting to www.netlib.org|160.36.58.108|:80... failed: Connection timed out. Retrying. --2011-05-19 03:14:13-- (try: 2) http://www.netlib.org/lapack/lapack-3.1.1.tgz Connecting to www.netlib.org|160.36.58.108|:80... failed: Connection timed out. Retrying. ...</source> Fix by downloading http://www.netlib.org/lapack/lapack-3.1.1.tgz to the GotoBLAS2 source directory and editing this line in the Makefile 184c184 &lt; -wget http://www.netlib.org/lapack/lapack-3.1.1.tgz --- &gt; # -wget http://www.netlib.org/lapack/lapack-3.1.1.tgz <br> GotoBLAS needs to be compiled individualy for each unique machine - ie each cluster. Add the following to .bashrc export CLUSTER=`hostname |sed 's/\([a-z]*\).*/\1/'` LD_LIBRARY_PATH=$HOME/lib/$CLUSTER:$HOME/lib:/usr/local/lib:$LD_LIBRARY_PATH export LIBRARY_PATH=$HOME/lib/$CLUSTER:$HOME/lib:/usr/local/lib:$LIBRARY_PATH Run the following script once on each cluster: <source lang="bash">#! /bin/bash echo "Compiling gotoblas for cluster: $CLUSTER" cd $HOME/src if [ ! -d "$CLUSTER" ]; then mkdir $CLUSTER fi cd $CLUSTER tar -xzf ../Goto*.tar.gz cd Goto* make &> m.log if [ ! -d "$HOME/lib/$CLUSTER" ]; then mkdir $HOME/lib/$CLUSTER fi cp libgoto2.so $HOME/lib/$CLUSTER echo results ls -d $HOME/src/$CLUSTER ls $HOME/src/$CLUSTER ls -d $HOME/lib/$CLUSTER ls $HOME/lib/$CLUSTER</source> note: for newer processors this may fail. If it is a NEHALEM processor try: make clean make TARGET=NEHALEM == Paging and the OOM-Killer == When doing exhaustion of available memory experiments, problems can occur with over-commit. See [[HCL_cluster#Paging_and_the_OOM-Killer]] for more detail. 
== Example of experiment setup across several sites == Sources of all files mentioned below is available at: [[Grid5000:sources]]. Pick one head node as the main head node (I use grenoble, but any will do). Setup sources cd dave/fupermod-1.1.0 make clean ./configure --with-cblas=goto --prefix=/usr/local/ Reserve 2 nodes from all clusters on a 3 cluster site: oarsub -r "2011-07-25 11:01:01" -t deploy -l cluster=3/nodes=2,walltime=11:59:00 Automate with: for a in 2 3 4; do for i in `cat sites.$a`; do echo $a $i; ssh $i oarsub -r "2011-07-25 11:01:01" -t deploy -l cluster=$a/nodes=2,walltime=11:59:00; done; done Then on each site, where xxx is site name: kadeploy3 -a $HOME/grid5000/lenny-dave.env -f $OAR_NODE_FILE --output-ok-nodes deployed.xxx Gather deployed files to a head node: for i in `cat ~/sites `; do echo $i; scp $i:deployed* .&nbsp;; done cat deployed.* &gt; deployed.all Copy cluster specific libs to each deployed node /usr/local/lib dir with script copy_local_libs.sh deployed.all Copy source files to root dir of each deployed node. 
Then make install each (node ssh -f does this in parallel) for i in `cat ~/deployed.all`; do echo $i; rsync -aP ~/dave/fupermod-1.1.0 root@$i:&nbsp;; done for i in `cat ~/deployed.all`; do echo $i; ssh -f root@$i "cd fupermod-1.1.0&nbsp;; make all install"&nbsp;; done ssh to the first node ssh `head -n1 deployed.all` n=$(cat deployed.all |wc -l) mpdboot --totalnum=$n --file=$HOME/deployed.all mpdtrace cd dave/data/ mpirun -n $n /usr/local/bin/partitioner -l /usr/local/lib/libmxm_col.so -a0 -D10000 -o N=100 Cleanup after: for i in `cat ~/sites `; do echo $i; ssh $i rm deployed.* ; done == Check network speed == apt-get install iperf b682c7b752a890cddbec9dcc5370edb9ee60aeae 643 642 2011-11-28T18:32:27Z Davepc 2 /* Setting up new deploy image */ wikitext text/x-wiki https://www.grid5000.fr/mediawiki/index.php/Grid5000:Home [https://www.grid5000.fr/mediawiki/index.php/Grid5000:UserCharter USAGE POLICY] == Login, job submission, deployment of image == *Select sites and clusters for experiments, using information on the [https://www.grid5000.fr/mediawiki/index.php/Grid5000:Network#Grid.275000_Sites Grid5000 network] and the [https://www.grid5000.fr/mediawiki/index.php/Status Status page] *Access is provided via access nodes '''access.SITE.grid5000.fr''' marked [https://www.grid5000.fr/mediawiki/index.php/External_access here] as ''accessible from '''everywhere''' via ssh with '''keyboard-interactive''' authentication method''. As soon as you are on one of the sites, you can directly ssh frontend node of any other site: <source lang="bash"> access_$ ssh frontend.SITE2 </source> *There is no access to Internet from computing nodes (external IPs should be registered on proxy), therefore, download/update your stuff at the access nodes. Several revision control clients are available. *Each site has a separate NFS, therefore, to run an application on several sites at once, you need to copy it '''scp, sftp, rsync''' between access or frontend nodes. 
*Jobs are run from the frondend nodes, using a [http://en.wikipedia.org/wiki/OpenPBS PBS]-like system [https://www.grid5000.fr/mediawiki/index.php/Cluster_experiment-OAR2 OAR]. Basic commands: **'''oarstat''' - queue status **'''oarsub''' - job submission **'''oardel''' - job removal Interactive job on deployed images: <source lang="bash"> fontend_$ oarsub -I -t deploy -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> Batch job on installed images: <source lang="bash"> fontend_$ oarsub BATCH_FILE -t allow_classic_ssh -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> *The image to deploy can be created and loaded with help of a [http://wiki.systemimager.org/index.php/Main_Page Systemimager]-like system [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2 Kadeploy]. Creating: [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2#Tune_an_environment_to_build_another_one:_customize_authentification_parameters described here] Loading: <source lang="bash"> fontend_$ kadeploy3 -a PATH_TO_PRIVATE_IMAGE_DESC -f $OAR_FILE_NODES </source> A Linux distribution lenny-x64-nfs-2.1 with mc, subversion, autotools, doxygen, MPICH2, GSL, Boost, R, gnuplot, graphviz, X11, evince is available at Orsay /home/nancy/alastovetsky/grid5000. == Compiling and running MPI applications == *Compilation should be done on one of the reserved nodes (e.g. ssh `head -n 1 $OAR_NODEFILE`) *Running MPI applications is described [https://www.grid5000.fr/mediawiki/index.php/Run_MPI_On_Grid%275000 here] **mpirun/mpiexec should be run from one of the reserved nodes (e.g. 
ssh `head -n 1 $OAR_NODEFILE`) == Setting up new deploy image == oarsub -I -t deploy -l nodes=1,walltime=12 kadeploy3 -e lenny-x64-nfs -f $OAR_FILE_NODES -k ssh root@`head -n 1 $OAR_NODEFILE` default password: grid5000 edit /etc/apt/sources.list apt-get update apt-get upgrade apt-get install libtool autoconf automake mc colorgcc ctags libboost-serialization-dev libboost-graph-dev libatlas-base-dev gfortran vim gdb valgrind screen subversion iperf bc gsl-bin libgsl0-dev <br> Compiled for sources by us: *<strike>gsl-1.14 (download: ftp://ftp.gnu.org/gnu/gsl/)&nbsp;</strike> ''Now with squeeze it is in repository.'' <strike>./configure &amp;&amp; make &amp;&amp; make install</strike> *mpich2 (download: http://www.mcs.anl.gov/research/projects/mpich2/downloads/index.php?s=downloads) ./configure --enable-shared --enable-sharedlibs=gcc --with-pm=mpd make &amp;&amp; make install Mpich2 installed to: Installing MPE2 include files to /usr/local/include Installing MPE2 libraries to /usr/local/lib Installing MPE2 utility programs to /usr/local/bin Installing MPE2 configuration files to /usr/local/etc Installing MPE2 system utility programs to /usr/local/sbin Installing MPE2 man to /usr/local/share/man Installing MPE2 html to /usr/local/share/doc/ Installed MPE2 in /usr/local *hwloc (and lstopo) (download: http://www.open-mpi.org/software/hwloc/v1.2/) compile from sources. To get xml support install libxml2-dev and pkg-config apt-get install libxml2-dev pkg-config tar -xzvf hwloc-1.1.1.tar.gz cd hwloc-1.1.1 ./configure &amp;&amp; make &amp;&amp; make install Change root password. rm sources from root dir. Edit the "message of the day" vi /etc/motd.tail echo 90 &gt; /proc/sys/vm/overcommit_ratio echo 2 &gt; /proc/sys/vm/overcommit_memory date &gt;&gt; release Cleanup apt-get clean rm /etc/udev/rules.d/*-persistent-net.rules Make image ssh root@'''node''' tgz-g5k &gt; $HOME/grid5000/'''imagename'''.tgz make appropriate .env file. 
kaenv3 -p lenny-x64-nfs -u deploy &gt; lenny-x64-custom-2.3.env <br> == GotoBLAS2 == When compiling gotoblas on a node without direct internet access get this error: <source lang="">wget http://www.netlib.org/lapack/lapack-3.1.1.tgz --2011-05-19 03:11:03-- http://www.netlib.org/lapack/lapack-3.1.1.tgz Resolving www.netlib.org... 160.36.58.108 Connecting to www.netlib.org|160.36.58.108|:80... failed: Connection timed out. Retrying. --2011-05-19 03:14:13-- (try: 2) http://www.netlib.org/lapack/lapack-3.1.1.tgz Connecting to www.netlib.org|160.36.58.108|:80... failed: Connection timed out. Retrying. ...</source> Fix by downloading http://www.netlib.org/lapack/lapack-3.1.1.tgz to the GotoBLAS2 source directory and editing this line in the Makefile 184c184 &lt; -wget http://www.netlib.org/lapack/lapack-3.1.1.tgz --- &gt; # -wget http://www.netlib.org/lapack/lapack-3.1.1.tgz <br> GotoBLAS needs to be compiled individualy for each unique machine - ie each cluster. Add the following to .bashrc export CLUSTER=`hostname |sed 's/\([a-z]*\).*/\1/'` LD_LIBRARY_PATH=$HOME/lib/$CLUSTER:$HOME/lib:/usr/local/lib:$LD_LIBRARY_PATH export LIBRARY_PATH=$HOME/lib/$CLUSTER:$HOME/lib:/usr/local/lib:$LIBRARY_PATH Run the following script once on each cluster: <source lang="bash">#! /bin/bash echo "Compiling gotoblas for cluster: $CLUSTER" cd $HOME/src if [ ! -d "$CLUSTER" ]; then mkdir $CLUSTER fi cd $CLUSTER tar -xzf ../Goto*.tar.gz cd Goto* make &> m.log if [ ! -d "$HOME/lib/$CLUSTER" ]; then mkdir $HOME/lib/$CLUSTER fi cp libgoto2.so $HOME/lib/$CLUSTER echo results ls -d $HOME/src/$CLUSTER ls $HOME/src/$CLUSTER ls -d $HOME/lib/$CLUSTER ls $HOME/lib/$CLUSTER</source> note: for newer processors this may fail. If it is a NEHALEM processor try: make clean make TARGET=NEHALEM == Paging and the OOM-Killer == When doing exhaustion of available memory experiments, problems can occur with over-commit. See [[HCL_cluster#Paging_and_the_OOM-Killer]] for more detail. 
== Example of experiment setup across several sites == Sources of all files mentioned below is available at: [[Grid5000:sources]]. Pick one head node as the main head node (I use grenoble, but any will do). Setup sources cd dave/fupermod-1.1.0 make clean ./configure --with-cblas=goto --prefix=/usr/local/ Reserve 2 nodes from all clusters on a 3 cluster site: oarsub -r "2011-07-25 11:01:01" -t deploy -l cluster=3/nodes=2,walltime=11:59:00 Automate with: for a in 2 3 4; do for i in `cat sites.$a`; do echo $a $i; ssh $i oarsub -r "2011-07-25 11:01:01" -t deploy -l cluster=$a/nodes=2,walltime=11:59:00; done; done Then on each site, where xxx is site name: kadeploy3 -a $HOME/grid5000/lenny-dave.env -f $OAR_NODE_FILE --output-ok-nodes deployed.xxx Gather deployed files to a head node: for i in `cat ~/sites `; do echo $i; scp $i:deployed* .&nbsp;; done cat deployed.* &gt; deployed.all Copy cluster specific libs to each deployed node /usr/local/lib dir with script copy_local_libs.sh deployed.all Copy source files to root dir of each deployed node. 
Then make install each (node ssh -f does this in parallel) for i in `cat ~/deployed.all`; do echo $i; rsync -aP ~/dave/fupermod-1.1.0 root@$i:&nbsp;; done for i in `cat ~/deployed.all`; do echo $i; ssh -f root@$i "cd fupermod-1.1.0&nbsp;; make all install"&nbsp;; done ssh to the first node ssh `head -n1 deployed.all` n=$(cat deployed.all |wc -l) mpdboot --totalnum=$n --file=$HOME/deployed.all mpdtrace cd dave/data/ mpirun -n $n /usr/local/bin/partitioner -l /usr/local/lib/libmxm_col.so -a0 -D10000 -o N=100 Cleanup after: for i in `cat ~/sites `; do echo $i; ssh $i rm deployed.* ; done == Check network speed == apt-get install iperf eace9efecb21ff88ba210d6b29e97ea99fddbb29 658 643 2012-01-10T15:27:57Z Davepc 2 /* Setting up new deploy image */ wikitext text/x-wiki https://www.grid5000.fr/mediawiki/index.php/Grid5000:Home [https://www.grid5000.fr/mediawiki/index.php/Grid5000:UserCharter USAGE POLICY] == Login, job submission, deployment of image == *Select sites and clusters for experiments, using information on the [https://www.grid5000.fr/mediawiki/index.php/Grid5000:Network#Grid.275000_Sites Grid5000 network] and the [https://www.grid5000.fr/mediawiki/index.php/Status Status page] *Access is provided via access nodes '''access.SITE.grid5000.fr''' marked [https://www.grid5000.fr/mediawiki/index.php/External_access here] as ''accessible from '''everywhere''' via ssh with '''keyboard-interactive''' authentication method''. As soon as you are on one of the sites, you can directly ssh frontend node of any other site: <source lang="bash"> access_$ ssh frontend.SITE2 </source> *There is no access to Internet from computing nodes (external IPs should be registered on proxy), therefore, download/update your stuff at the access nodes. Several revision control clients are available. *Each site has a separate NFS, therefore, to run an application on several sites at once, you need to copy it '''scp, sftp, rsync''' between access or frontend nodes. 
*Jobs are run from the frondend nodes, using a [http://en.wikipedia.org/wiki/OpenPBS PBS]-like system [https://www.grid5000.fr/mediawiki/index.php/Cluster_experiment-OAR2 OAR]. Basic commands: **'''oarstat''' - queue status **'''oarsub''' - job submission **'''oardel''' - job removal Interactive job on deployed images: <source lang="bash"> fontend_$ oarsub -I -t deploy -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> Batch job on installed images: <source lang="bash"> fontend_$ oarsub BATCH_FILE -t allow_classic_ssh -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> *The image to deploy can be created and loaded with help of a [http://wiki.systemimager.org/index.php/Main_Page Systemimager]-like system [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2 Kadeploy]. Creating: [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2#Tune_an_environment_to_build_another_one:_customize_authentification_parameters described here] Loading: <source lang="bash"> fontend_$ kadeploy3 -a PATH_TO_PRIVATE_IMAGE_DESC -f $OAR_FILE_NODES </source> A Linux distribution lenny-x64-nfs-2.1 with mc, subversion, autotools, doxygen, MPICH2, GSL, Boost, R, gnuplot, graphviz, X11, evince is available at Orsay /home/nancy/alastovetsky/grid5000. == Compiling and running MPI applications == *Compilation should be done on one of the reserved nodes (e.g. ssh `head -n 1 $OAR_NODEFILE`) *Running MPI applications is described [https://www.grid5000.fr/mediawiki/index.php/Run_MPI_On_Grid%275000 here] **mpirun/mpiexec should be run from one of the reserved nodes (e.g. 
ssh `head -n 1 $OAR_NODEFILE`) == Setting up new deploy image == List available images kaenv3 -l Then book node and launch: oarsub -I -t deploy -l nodes=1,walltime=12 kadeploy3 -e squeeze-x64-big -f $OAR_FILE_NODES -k ssh root@`head -n 1 $OAR_NODEFILE` default password: grid5000 edit /etc/apt/sources.list apt-get update apt-get upgrade apt-get install libtool autoconf automake mc colorgcc ctags libboost-serialization-dev libboost-graph-dev libatlas-base-dev gfortran vim gdb valgrind screen subversion iperf bc gsl-bin libgsl0-dev <br> Compiled for sources by us: *<strike>gsl-1.14 (download: ftp://ftp.gnu.org/gnu/gsl/)&nbsp;</strike> ''Now with squeeze it is in repository.'' <strike>./configure &amp;&amp; make &amp;&amp; make install</strike> *mpich2 (download: http://www.mcs.anl.gov/research/projects/mpich2/downloads/index.php?s=downloads) ./configure --enable-shared --enable-sharedlibs=gcc --with-pm=mpd make &amp;&amp; make install Mpich2 installed to: Installing MPE2 include files to /usr/local/include Installing MPE2 libraries to /usr/local/lib Installing MPE2 utility programs to /usr/local/bin Installing MPE2 configuration files to /usr/local/etc Installing MPE2 system utility programs to /usr/local/sbin Installing MPE2 man to /usr/local/share/man Installing MPE2 html to /usr/local/share/doc/ Installed MPE2 in /usr/local *hwloc (and lstopo) (download: http://www.open-mpi.org/software/hwloc/v1.2/) compile from sources. To get xml support install libxml2-dev and pkg-config apt-get install libxml2-dev pkg-config tar -xzvf hwloc-1.1.1.tar.gz cd hwloc-1.1.1 ./configure &amp;&amp; make &amp;&amp; make install Change root password. rm sources from root dir. 
Edit the "message of the day" vi /etc/motd.tail echo 90 &gt; /proc/sys/vm/overcommit_ratio echo 2 &gt; /proc/sys/vm/overcommit_memory date &gt;&gt; release Cleanup apt-get clean rm /etc/udev/rules.d/*-persistent-net.rules Make image ssh root@'''node''' tgz-g5k &gt; $HOME/grid5000/'''imagename'''.tgz make appropriate .env file. kaenv3 -p lenny-x64-nfs -u deploy &gt; lenny-x64-custom-2.3.env <br> == GotoBLAS2 == When compiling gotoblas on a node without direct internet access get this error: <source lang="">wget http://www.netlib.org/lapack/lapack-3.1.1.tgz --2011-05-19 03:11:03-- http://www.netlib.org/lapack/lapack-3.1.1.tgz Resolving www.netlib.org... 160.36.58.108 Connecting to www.netlib.org|160.36.58.108|:80... failed: Connection timed out. Retrying. --2011-05-19 03:14:13-- (try: 2) http://www.netlib.org/lapack/lapack-3.1.1.tgz Connecting to www.netlib.org|160.36.58.108|:80... failed: Connection timed out. Retrying. ...</source> Fix by downloading http://www.netlib.org/lapack/lapack-3.1.1.tgz to the GotoBLAS2 source directory and editing this line in the Makefile 184c184 &lt; -wget http://www.netlib.org/lapack/lapack-3.1.1.tgz --- &gt; # -wget http://www.netlib.org/lapack/lapack-3.1.1.tgz <br> GotoBLAS needs to be compiled individualy for each unique machine - ie each cluster. Add the following to .bashrc export CLUSTER=`hostname |sed 's/\([a-z]*\).*/\1/'` LD_LIBRARY_PATH=$HOME/lib/$CLUSTER:$HOME/lib:/usr/local/lib:$LD_LIBRARY_PATH export LIBRARY_PATH=$HOME/lib/$CLUSTER:$HOME/lib:/usr/local/lib:$LIBRARY_PATH Run the following script once on each cluster: <source lang="bash">#! /bin/bash echo "Compiling gotoblas for cluster: $CLUSTER" cd $HOME/src if [ ! -d "$CLUSTER" ]; then mkdir $CLUSTER fi cd $CLUSTER tar -xzf ../Goto*.tar.gz cd Goto* make &> m.log if [ ! 
-d "$HOME/lib/$CLUSTER" ]; then mkdir $HOME/lib/$CLUSTER fi cp libgoto2.so $HOME/lib/$CLUSTER echo results ls -d $HOME/src/$CLUSTER ls $HOME/src/$CLUSTER ls -d $HOME/lib/$CLUSTER ls $HOME/lib/$CLUSTER</source> note: for newer processors this may fail. If it is a NEHALEM processor try: make clean make TARGET=NEHALEM == Paging and the OOM-Killer == When doing exhaustion of available memory experiments, problems can occur with over-commit. See [[HCL_cluster#Paging_and_the_OOM-Killer]] for more detail. == Example of experiment setup across several sites == Sources of all files mentioned below is available at: [[Grid5000:sources]]. Pick one head node as the main head node (I use grenoble, but any will do). Setup sources cd dave/fupermod-1.1.0 make clean ./configure --with-cblas=goto --prefix=/usr/local/ Reserve 2 nodes from all clusters on a 3 cluster site: oarsub -r "2011-07-25 11:01:01" -t deploy -l cluster=3/nodes=2,walltime=11:59:00 Automate with: for a in 2 3 4; do for i in `cat sites.$a`; do echo $a $i; ssh $i oarsub -r "2011-07-25 11:01:01" -t deploy -l cluster=$a/nodes=2,walltime=11:59:00; done; done Then on each site, where xxx is site name: kadeploy3 -a $HOME/grid5000/lenny-dave.env -f $OAR_NODE_FILE --output-ok-nodes deployed.xxx Gather deployed files to a head node: for i in `cat ~/sites `; do echo $i; scp $i:deployed* .&nbsp;; done cat deployed.* &gt; deployed.all Copy cluster specific libs to each deployed node /usr/local/lib dir with script copy_local_libs.sh deployed.all Copy source files to root dir of each deployed node. 
https://www.grid5000.fr/mediawiki/index.php/Grid5000:Home

[https://www.grid5000.fr/mediawiki/index.php/Grid5000:UserCharter USAGE POLICY]

== Login, job submission, deployment of image ==
*Select sites and clusters for experiments using the information on the [https://www.grid5000.fr/mediawiki/index.php/Grid5000:Network#Grid.275000_Sites Grid5000 network] page and the [https://www.grid5000.fr/mediawiki/index.php/Status Status page].
*Access is provided via the access nodes '''access.SITE.grid5000.fr''', marked [https://www.grid5000.fr/mediawiki/index.php/External_access here] as ''accessible from '''everywhere''' via ssh with the '''keyboard-interactive''' authentication method''. Once you are on one of the sites, you can ssh directly to the frontend node of any other site:
<source lang="bash">
access_$ ssh frontend.SITE2
</source>
*There is no Internet access from the computing nodes (external IPs must be registered on the proxy), so download and update your files on the access nodes. Several revision control clients are available.
*Each site has a separate NFS, so to run an application on several sites at once you need to copy it between access or frontend nodes with '''scp''', '''sftp''' or '''rsync'''.
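Since each site has its own NFS, staging the same tree to several sites is just a loop over frontends. A minimal sketch, printed as a dry run rather than executed; the site names and paths here are illustrative, not taken from this page:

```shell
# Print (dry run) the rsync invocations that would stage a source tree
# to each site's frontend. Site names and paths are illustrative.
sites="grenoble rennes sophia"
for s in $sites; do
  echo rsync -a ~/src/myapp/ "$s.grid5000.fr:src/myapp/"
done
```

Remove the echo to actually copy (from inside Grid5000, where the frontends are reachable).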
*Jobs are run from the frontend nodes, using [https://www.grid5000.fr/mediawiki/index.php/Cluster_experiment-OAR2 OAR], a [http://en.wikipedia.org/wiki/OpenPBS PBS]-like system. Basic commands:
**'''oarstat''' - queue status
**'''oarsub''' - job submission
**'''oardel''' - job removal
Interactive job on deployed images:
<source lang="bash">
frontend_$ oarsub -I -t deploy -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"']
</source>
Batch job on installed images:
<source lang="bash">
frontend_$ oarsub BATCH_FILE -t allow_classic_ssh -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"']
</source>
*The image to deploy can be created and loaded with [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2 Kadeploy], a [http://wiki.systemimager.org/index.php/Main_Page Systemimager]-like system. Creating an image is [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2#Tune_an_environment_to_build_another_one:_customize_authentification_parameters described here]. Loading:
<source lang="bash">
frontend_$ kadeploy3 -a PATH_TO_PRIVATE_IMAGE_DESC -f $OAR_FILE_NODES
</source>
A Linux distribution lenny-x64-nfs-2.1 with mc, subversion, autotools, doxygen, MPICH2, GSL, Boost, R, gnuplot, graphviz, X11 and evince is available at Orsay in /home/nancy/alastovetsky/grid5000.

== Compiling and running MPI applications ==
*Compilation should be done on one of the reserved nodes (e.g. ssh `head -n 1 $OAR_NODEFILE`).
*Running MPI applications is described [https://www.grid5000.fr/mediawiki/index.php/Run_MPI_On_Grid%275000 here].
**mpirun/mpiexec should be run from one of the reserved nodes (e.g. ssh `head -n 1 $OAR_NODEFILE`).

== Setting up new deploy image ==
List the available images:
 kaenv3 -l
Then book a node and launch:
 oarsub -I -t deploy -l nodes=1,walltime=12
 kadeploy3 -e squeeze-x64-big -f $OAR_FILE_NODES -k
 ssh root@`head -n 1 $OAR_NODEFILE`
The default password is grid5000. Edit /etc/apt/sources.list, then:
 apt-get update
 apt-get upgrade
 apt-get install libtool autoconf automake mc colorgcc ctags libboost-serialization-dev libboost-graph-dev libatlas-base-dev gfortran vim gdb valgrind screen subversion iperf bc gsl-bin libgsl0-dev

Compiled from sources by us:
*<strike>gsl-1.14 (download: ftp://ftp.gnu.org/gnu/gsl/) ./configure &amp;&amp; make &amp;&amp; make install</strike> ''Now with squeeze it is in the repository.''
*mpich2 (download: http://www.mcs.anl.gov/research/projects/mpich2/downloads/index.php?s=downloads)
 ./configure --enable-shared --enable-sharedlibs=gcc --with-pm=mpd
 make &amp;&amp; make install
MPICH2 reports the install locations:
 Installing MPE2 include files to /usr/local/include
 Installing MPE2 libraries to /usr/local/lib
 Installing MPE2 utility programs to /usr/local/bin
 Installing MPE2 configuration files to /usr/local/etc
 Installing MPE2 system utility programs to /usr/local/sbin
 Installing MPE2 man to /usr/local/share/man
 Installing MPE2 html to /usr/local/share/doc/
 Installed MPE2 in /usr/local
*hwloc (and lstopo) (download: http://www.open-mpi.org/software/hwloc/v1.2/), compiled from sources. To get XML support, first install libxml2-dev and pkg-config:
 apt-get install libxml2-dev pkg-config
 tar -xzvf hwloc-1.1.1.tar.gz
 cd hwloc-1.1.1
 ./configure &amp;&amp; make &amp;&amp; make install
Change the root password and rm the sources from the root dir.
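The oarsub commands above take walltime as HH[:MM[:SS]]. A small helper for sanity-checking such strings, e.g. to compare reservations in seconds; this helper is hypothetical, not part of OAR:

```shell
# Convert an OAR-style walltime HH[:MM[:SS]] to seconds.
# Hypothetical helper, not an OAR tool.
walltime_to_seconds() {
  local IFS=:
  set -- $1
  echo $(( ${1:-0} * 3600 + ${2:-0} * 60 + ${3:-0} ))
}
walltime_to_seconds 11:59:00   # 43140
walltime_to_seconds 12         # 43200
```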
Edit the "message of the day":
 vi /etc/motd.tail
Set the overcommit policy and record the release date:
 echo 90 &gt; /proc/sys/vm/overcommit_ratio
 echo 2 &gt; /proc/sys/vm/overcommit_memory
 date &gt;&gt; release
Cleanup:
 apt-get clean
 rm /etc/udev/rules.d/*-persistent-net.rules
Make the image:
 ssh root@'''node'''
 tgz-g5k &gt; $HOME/grid5000/'''imagename'''.tgz
Make an appropriate .env file:
 kaenv3 -p lenny-x64-nfs -u deploy &gt; lenny-x64-custom-2.3.env

== GotoBLAS2 ==
When compiling GotoBLAS2 on a node without direct Internet access, you get this error:
<source lang="">
wget http://www.netlib.org/lapack/lapack-3.1.1.tgz
--2011-05-19 03:11:03-- http://www.netlib.org/lapack/lapack-3.1.1.tgz
Resolving www.netlib.org... 160.36.58.108
Connecting to www.netlib.org|160.36.58.108|:80... failed: Connection timed out. Retrying.
--2011-05-19 03:14:13-- (try: 2) http://www.netlib.org/lapack/lapack-3.1.1.tgz
Connecting to www.netlib.org|160.36.58.108|:80... failed: Connection timed out. Retrying.
...
</source>
Fix it by downloading http://www.netlib.org/lapack/lapack-3.1.1.tgz to the GotoBLAS2 source directory and commenting out this line in the Makefile:
 184c184
 &lt; -wget http://www.netlib.org/lapack/lapack-3.1.1.tgz
 ---
 &gt; # -wget http://www.netlib.org/lapack/lapack-3.1.1.tgz

GotoBLAS2 needs to be compiled individually for each unique machine, i.e. for each cluster. Add the following to .bashrc:
 export CLUSTER=`hostname |sed 's/\([a-z]*\).*/\1/'`
 LD_LIBRARY_PATH=$HOME/lib/$CLUSTER:$HOME/lib:/usr/local/lib:$LD_LIBRARY_PATH
 export LIBRARY_PATH=$HOME/lib/$CLUSTER:$HOME/lib:/usr/local/lib:$LIBRARY_PATH
Run the following script once on each cluster:
<source lang="bash">
#!/bin/bash
echo "Compiling gotoblas for cluster: $CLUSTER"
cd $HOME/src
if [ ! -d "$CLUSTER" ]; then
  mkdir $CLUSTER
fi
cd $CLUSTER
tar -xzf ../Goto*.tar.gz
cd Goto*
make &> m.log
if [ ! -d "$HOME/lib/$CLUSTER" ]; then
  mkdir $HOME/lib/$CLUSTER
fi
cp libgoto2.so $HOME/lib/$CLUSTER
echo results
ls -d $HOME/src/$CLUSTER
ls $HOME/src/$CLUSTER
ls -d $HOME/lib/$CLUSTER
ls $HOME/lib/$CLUSTER
</source>
Note: for newer processors this may fail. If it is a NEHALEM processor, try:
 make clean
 make TARGET=NEHALEM

== Paging and the OOM-Killer ==
When doing exhaustion-of-available-memory experiments, problems can occur with over-commit. See [[HCL_cluster#Paging_and_the_OOM-Killer]] for more detail.

== Example of experiment setup across several sites ==
The sources of all files mentioned below are available at [[Grid5000:sources]]. Pick one head node as the main head node (I use grenoble, but any will do).

Set up the sources:
 cd dave/fupermod-1.1.0
 make clean
 ./configure --with-cblas=goto --prefix=/usr/local/
Reserve 2 nodes from all clusters on a 3-cluster site:
 oarsub -r "2011-07-25 11:01:01" -t deploy -l cluster=3/nodes=2,walltime=11:59:00
Automate with:
 for a in 2 3 4; do for i in `cat sites.$a`; do echo $a $i; ssh $i oarsub -r "2011-07-25 11:01:01" -t deploy -l cluster=$a/nodes=2,walltime=11:59:00; done; done
Then on each site, where xxx is the site name:
 kadeploy3 -a $HOME/grid5000/lenny-dave.env -f $OAR_NODE_FILE --output-ok-nodes deployed.xxx
Gather the deployed files to a head node:
 for i in `cat ~/sites`; do echo $i; scp $i:deployed* . ; done
 cat deployed.* &gt; deployed.all
Copy the cluster-specific libs to the /usr/local/lib dir of each deployed node with the script:
 copy_local_libs.sh deployed.all
Copy the source files to the root dir of each deployed node.
Then make install each (node ssh -f does this in parallel) for i in `cat ~/deployed.all`; do echo $i; rsync -aP ~/dave/fupermod-1.1.0 root@$i:&nbsp;; done for i in `cat ~/deployed.all`; do echo $i; ssh -f root@$i "cd fupermod-1.1.0&nbsp;; make all install"&nbsp;; done ssh to the first node ssh `head -n1 deployed.all` n=$(cat deployed.all |wc -l) mpdboot --totalnum=$n --file=$HOME/deployed.all mpdtrace cd dave/data/ mpirun -n $n /usr/local/bin/partitioner -l /usr/local/lib/libmxm_col.so -a0 -D10000 -o N=100 Cleanup after: for i in `cat ~/sites `; do echo $i; ssh $i rm deployed.* ; done == Check network speed == apt-get install iperf == Choose which network interface to use == mpirun --mca btl self,openib ... or mpirun --mca btl self,tcp ... b45fae8c374b7ddea72c318169db196d02b4ab6c 660 659 2012-01-11T16:54:14Z Davepc 2 /* Setting up new deploy image */ wikitext text/x-wiki https://www.grid5000.fr/mediawiki/index.php/Grid5000:Home [https://www.grid5000.fr/mediawiki/index.php/Grid5000:UserCharter USAGE POLICY] == Login, job submission, deployment of image == *Select sites and clusters for experiments, using information on the [https://www.grid5000.fr/mediawiki/index.php/Grid5000:Network#Grid.275000_Sites Grid5000 network] and the [https://www.grid5000.fr/mediawiki/index.php/Status Status page] *Access is provided via access nodes '''access.SITE.grid5000.fr''' marked [https://www.grid5000.fr/mediawiki/index.php/External_access here] as ''accessible from '''everywhere''' via ssh with '''keyboard-interactive''' authentication method''. As soon as you are on one of the sites, you can directly ssh frontend node of any other site: <source lang="bash"> access_$ ssh frontend.SITE2 </source> *There is no access to Internet from computing nodes (external IPs should be registered on proxy), therefore, download/update your stuff at the access nodes. Several revision control clients are available. 
*Each site has a separate NFS, therefore, to run an application on several sites at once, you need to copy it '''scp, sftp, rsync''' between access or frontend nodes. *Jobs are run from the frondend nodes, using a [http://en.wikipedia.org/wiki/OpenPBS PBS]-like system [https://www.grid5000.fr/mediawiki/index.php/Cluster_experiment-OAR2 OAR]. Basic commands: **'''oarstat''' - queue status **'''oarsub''' - job submission **'''oardel''' - job removal Interactive job on deployed images: <source lang="bash"> fontend_$ oarsub -I -t deploy -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> Batch job on installed images: <source lang="bash"> fontend_$ oarsub BATCH_FILE -t allow_classic_ssh -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> *The image to deploy can be created and loaded with help of a [http://wiki.systemimager.org/index.php/Main_Page Systemimager]-like system [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2 Kadeploy]. Creating: [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2#Tune_an_environment_to_build_another_one:_customize_authentification_parameters described here] Loading: <source lang="bash"> fontend_$ kadeploy3 -a PATH_TO_PRIVATE_IMAGE_DESC -f $OAR_FILE_NODES </source> A Linux distribution lenny-x64-nfs-2.1 with mc, subversion, autotools, doxygen, MPICH2, GSL, Boost, R, gnuplot, graphviz, X11, evince is available at Orsay /home/nancy/alastovetsky/grid5000. == Compiling and running MPI applications == *Compilation should be done on one of the reserved nodes (e.g. ssh `head -n 1 $OAR_NODEFILE`) *Running MPI applications is described [https://www.grid5000.fr/mediawiki/index.php/Run_MPI_On_Grid%275000 here] **mpirun/mpiexec should be run from one of the reserved nodes (e.g. 
ssh `head -n 1 $OAR_NODEFILE`) == Setting up new deploy image == List available images kaenv3 -l Then book node and launch: oarsub -I -t deploy -l nodes=1,walltime=12 kadeploy3 -e squeeze-x64-big -f $OAR_FILE_NODES -k ssh root@`head -n 1 $OAR_NODEFILE` default password: grid5000 edit /etc/apt/sources.list apt-get update apt-get upgrade apt-get install libtool autoconf automake mc colorgcc ctags libboost-serialization-dev libboost-graph-dev libatlas-base-dev gfortran vim gdb valgrind screen subversion iperf bc gsl-bin libgsl0-dev Possibly also install: libxml2-dev binutils-dev libunwind7-dev <br> Compiled for sources by us: *<strike>gsl-1.14 (download: ftp://ftp.gnu.org/gnu/gsl/)&nbsp;</strike> ''Now with squeeze it is in repository.'' <strike>./configure &amp;&amp; make &amp;&amp; make install</strike> *mpich2 (download: http://www.mcs.anl.gov/research/projects/mpich2/downloads/index.php?s=downloads) ./configure --enable-shared --enable-sharedlibs=gcc --with-pm=mpd make &amp;&amp; make install Mpich2 installed to: Installing MPE2 include files to /usr/local/include Installing MPE2 libraries to /usr/local/lib Installing MPE2 utility programs to /usr/local/bin Installing MPE2 configuration files to /usr/local/etc Installing MPE2 system utility programs to /usr/local/sbin Installing MPE2 man to /usr/local/share/man Installing MPE2 html to /usr/local/share/doc/ Installed MPE2 in /usr/local *hwloc (and lstopo) (download: http://www.open-mpi.org/software/hwloc/v1.2/) compile from sources. To get xml support install libxml2-dev and pkg-config apt-get install libxml2-dev pkg-config tar -xzvf hwloc-1.1.1.tar.gz cd hwloc-1.1.1 ./configure &amp;&amp; make &amp;&amp; make install Change root password. rm sources from root dir. 
Edit the "message of the day" vi /etc/motd.tail echo 90 &gt; /proc/sys/vm/overcommit_ratio echo 2 &gt; /proc/sys/vm/overcommit_memory date &gt;&gt; release Cleanup apt-get clean rm /etc/udev/rules.d/*-persistent-net.rules Make image ssh root@'''node''' tgz-g5k &gt; $HOME/grid5000/'''imagename'''.tgz make appropriate .env file. kaenv3 -p lenny-x64-nfs -u deploy &gt; lenny-x64-custom-2.3.env <br> == GotoBLAS2 == When compiling gotoblas on a node without direct internet access get this error: <source lang="">wget http://www.netlib.org/lapack/lapack-3.1.1.tgz --2011-05-19 03:11:03-- http://www.netlib.org/lapack/lapack-3.1.1.tgz Resolving www.netlib.org... 160.36.58.108 Connecting to www.netlib.org|160.36.58.108|:80... failed: Connection timed out. Retrying. --2011-05-19 03:14:13-- (try: 2) http://www.netlib.org/lapack/lapack-3.1.1.tgz Connecting to www.netlib.org|160.36.58.108|:80... failed: Connection timed out. Retrying. ...</source> Fix by downloading http://www.netlib.org/lapack/lapack-3.1.1.tgz to the GotoBLAS2 source directory and editing this line in the Makefile 184c184 &lt; -wget http://www.netlib.org/lapack/lapack-3.1.1.tgz --- &gt; # -wget http://www.netlib.org/lapack/lapack-3.1.1.tgz <br> GotoBLAS needs to be compiled individualy for each unique machine - ie each cluster. Add the following to .bashrc export CLUSTER=`hostname |sed 's/\([a-z]*\).*/\1/'` LD_LIBRARY_PATH=$HOME/lib/$CLUSTER:$HOME/lib:/usr/local/lib:$LD_LIBRARY_PATH export LIBRARY_PATH=$HOME/lib/$CLUSTER:$HOME/lib:/usr/local/lib:$LIBRARY_PATH Run the following script once on each cluster: <source lang="bash">#! /bin/bash echo "Compiling gotoblas for cluster: $CLUSTER" cd $HOME/src if [ ! -d "$CLUSTER" ]; then mkdir $CLUSTER fi cd $CLUSTER tar -xzf ../Goto*.tar.gz cd Goto* make &> m.log if [ ! 
-d "$HOME/lib/$CLUSTER" ]; then mkdir $HOME/lib/$CLUSTER fi cp libgoto2.so $HOME/lib/$CLUSTER echo results ls -d $HOME/src/$CLUSTER ls $HOME/src/$CLUSTER ls -d $HOME/lib/$CLUSTER ls $HOME/lib/$CLUSTER</source> note: for newer processors this may fail. If it is a NEHALEM processor try: make clean make TARGET=NEHALEM == Paging and the OOM-Killer == When doing exhaustion of available memory experiments, problems can occur with over-commit. See [[HCL_cluster#Paging_and_the_OOM-Killer]] for more detail. == Example of experiment setup across several sites == Sources of all files mentioned below is available at: [[Grid5000:sources]]. Pick one head node as the main head node (I use grenoble, but any will do). Setup sources cd dave/fupermod-1.1.0 make clean ./configure --with-cblas=goto --prefix=/usr/local/ Reserve 2 nodes from all clusters on a 3 cluster site: oarsub -r "2011-07-25 11:01:01" -t deploy -l cluster=3/nodes=2,walltime=11:59:00 Automate with: for a in 2 3 4; do for i in `cat sites.$a`; do echo $a $i; ssh $i oarsub -r "2011-07-25 11:01:01" -t deploy -l cluster=$a/nodes=2,walltime=11:59:00; done; done Then on each site, where xxx is site name: kadeploy3 -a $HOME/grid5000/lenny-dave.env -f $OAR_NODE_FILE --output-ok-nodes deployed.xxx Gather deployed files to a head node: for i in `cat ~/sites `; do echo $i; scp $i:deployed* .&nbsp;; done cat deployed.* &gt; deployed.all Copy cluster specific libs to each deployed node /usr/local/lib dir with script copy_local_libs.sh deployed.all Copy source files to root dir of each deployed node. 
Then make install each (node ssh -f does this in parallel) for i in `cat ~/deployed.all`; do echo $i; rsync -aP ~/dave/fupermod-1.1.0 root@$i:&nbsp;; done for i in `cat ~/deployed.all`; do echo $i; ssh -f root@$i "cd fupermod-1.1.0&nbsp;; make all install"&nbsp;; done ssh to the first node ssh `head -n1 deployed.all` n=$(cat deployed.all |wc -l) mpdboot --totalnum=$n --file=$HOME/deployed.all mpdtrace cd dave/data/ mpirun -n $n /usr/local/bin/partitioner -l /usr/local/lib/libmxm_col.so -a0 -D10000 -o N=100 Cleanup after: for i in `cat ~/sites `; do echo $i; ssh $i rm deployed.* ; done == Check network speed == apt-get install iperf == Choose which network interface to use == mpirun --mca btl self,openib ... or mpirun --mca btl self,tcp ... 5adbaaa890d835c23ecc797bf1d4f24b15afd428 661 660 2012-01-11T16:55:12Z Davepc 2 /* Setting up new deploy image */ wikitext text/x-wiki https://www.grid5000.fr/mediawiki/index.php/Grid5000:Home [https://www.grid5000.fr/mediawiki/index.php/Grid5000:UserCharter USAGE POLICY] == Login, job submission, deployment of image == *Select sites and clusters for experiments, using information on the [https://www.grid5000.fr/mediawiki/index.php/Grid5000:Network#Grid.275000_Sites Grid5000 network] and the [https://www.grid5000.fr/mediawiki/index.php/Status Status page] *Access is provided via access nodes '''access.SITE.grid5000.fr''' marked [https://www.grid5000.fr/mediawiki/index.php/External_access here] as ''accessible from '''everywhere''' via ssh with '''keyboard-interactive''' authentication method''. As soon as you are on one of the sites, you can directly ssh frontend node of any other site: <source lang="bash"> access_$ ssh frontend.SITE2 </source> *There is no access to Internet from computing nodes (external IPs should be registered on proxy), therefore, download/update your stuff at the access nodes. Several revision control clients are available. 
*Each site has a separate NFS, therefore, to run an application on several sites at once, you need to copy it '''scp, sftp, rsync''' between access or frontend nodes. *Jobs are run from the frondend nodes, using a [http://en.wikipedia.org/wiki/OpenPBS PBS]-like system [https://www.grid5000.fr/mediawiki/index.php/Cluster_experiment-OAR2 OAR]. Basic commands: **'''oarstat''' - queue status **'''oarsub''' - job submission **'''oardel''' - job removal Interactive job on deployed images: <source lang="bash"> fontend_$ oarsub -I -t deploy -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> Batch job on installed images: <source lang="bash"> fontend_$ oarsub BATCH_FILE -t allow_classic_ssh -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> *The image to deploy can be created and loaded with help of a [http://wiki.systemimager.org/index.php/Main_Page Systemimager]-like system [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2 Kadeploy]. Creating: [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2#Tune_an_environment_to_build_another_one:_customize_authentification_parameters described here] Loading: <source lang="bash"> fontend_$ kadeploy3 -a PATH_TO_PRIVATE_IMAGE_DESC -f $OAR_FILE_NODES </source> A Linux distribution lenny-x64-nfs-2.1 with mc, subversion, autotools, doxygen, MPICH2, GSL, Boost, R, gnuplot, graphviz, X11, evince is available at Orsay /home/nancy/alastovetsky/grid5000. == Compiling and running MPI applications == *Compilation should be done on one of the reserved nodes (e.g. ssh `head -n 1 $OAR_NODEFILE`) *Running MPI applications is described [https://www.grid5000.fr/mediawiki/index.php/Run_MPI_On_Grid%275000 here] **mpirun/mpiexec should be run from one of the reserved nodes (e.g. 
ssh `head -n 1 $OAR_NODEFILE`) == Setting up new deploy image == List available images kaenv3 -l Then book node and launch: oarsub -I -t deploy -l nodes=1,walltime=12 kadeploy3 -e squeeze-x64-big -f $OAR_FILE_NODES -k ssh root@`head -n 1 $OAR_NODEFILE` default password: grid5000 edit /etc/apt/sources.list apt-get update apt-get upgrade apt-get install libtool autoconf automake mc colorgcc ctags libboost-serialization-dev libboost-graph-dev libatlas-base-dev gfortran vim gdb valgrind screen subversion iperf bc gsl-bin libgsl0-dev Possibly also install (for using extrae): libxml2-dev binutils-dev libunwind7-dev <br> Compiled for sources by us: *<strike>gsl-1.14 (download: ftp://ftp.gnu.org/gnu/gsl/)&nbsp;</strike> ''Now with squeeze it is in repository.'' <strike>./configure &amp;&amp; make &amp;&amp; make install</strike> *mpich2 (download: http://www.mcs.anl.gov/research/projects/mpich2/downloads/index.php?s=downloads) ./configure --enable-shared --enable-sharedlibs=gcc --with-pm=mpd make &amp;&amp; make install Mpich2 installed to: Installing MPE2 include files to /usr/local/include Installing MPE2 libraries to /usr/local/lib Installing MPE2 utility programs to /usr/local/bin Installing MPE2 configuration files to /usr/local/etc Installing MPE2 system utility programs to /usr/local/sbin Installing MPE2 man to /usr/local/share/man Installing MPE2 html to /usr/local/share/doc/ Installed MPE2 in /usr/local *hwloc (and lstopo) (download: http://www.open-mpi.org/software/hwloc/v1.2/) compile from sources. To get xml support install libxml2-dev and pkg-config apt-get install libxml2-dev pkg-config tar -xzvf hwloc-1.1.1.tar.gz cd hwloc-1.1.1 ./configure &amp;&amp; make &amp;&amp; make install Change root password. rm sources from root dir. 
Edit the "message of the day" vi /etc/motd.tail echo 90 &gt; /proc/sys/vm/overcommit_ratio echo 2 &gt; /proc/sys/vm/overcommit_memory date &gt;&gt; release Cleanup apt-get clean rm /etc/udev/rules.d/*-persistent-net.rules Make image ssh root@'''node''' tgz-g5k &gt; $HOME/grid5000/'''imagename'''.tgz make appropriate .env file. kaenv3 -p lenny-x64-nfs -u deploy &gt; lenny-x64-custom-2.3.env <br> == GotoBLAS2 == When compiling gotoblas on a node without direct internet access get this error: <source lang="">wget http://www.netlib.org/lapack/lapack-3.1.1.tgz --2011-05-19 03:11:03-- http://www.netlib.org/lapack/lapack-3.1.1.tgz Resolving www.netlib.org... 160.36.58.108 Connecting to www.netlib.org|160.36.58.108|:80... failed: Connection timed out. Retrying. --2011-05-19 03:14:13-- (try: 2) http://www.netlib.org/lapack/lapack-3.1.1.tgz Connecting to www.netlib.org|160.36.58.108|:80... failed: Connection timed out. Retrying. ...</source> Fix by downloading http://www.netlib.org/lapack/lapack-3.1.1.tgz to the GotoBLAS2 source directory and editing this line in the Makefile 184c184 &lt; -wget http://www.netlib.org/lapack/lapack-3.1.1.tgz --- &gt; # -wget http://www.netlib.org/lapack/lapack-3.1.1.tgz <br> GotoBLAS needs to be compiled individualy for each unique machine - ie each cluster. Add the following to .bashrc export CLUSTER=`hostname |sed 's/\([a-z]*\).*/\1/'` LD_LIBRARY_PATH=$HOME/lib/$CLUSTER:$HOME/lib:/usr/local/lib:$LD_LIBRARY_PATH export LIBRARY_PATH=$HOME/lib/$CLUSTER:$HOME/lib:/usr/local/lib:$LIBRARY_PATH Run the following script once on each cluster: <source lang="bash">#! /bin/bash echo "Compiling gotoblas for cluster: $CLUSTER" cd $HOME/src if [ ! -d "$CLUSTER" ]; then mkdir $CLUSTER fi cd $CLUSTER tar -xzf ../Goto*.tar.gz cd Goto* make &> m.log if [ ! 
-d "$HOME/lib/$CLUSTER" ]; then mkdir $HOME/lib/$CLUSTER fi cp libgoto2.so $HOME/lib/$CLUSTER echo results ls -d $HOME/src/$CLUSTER ls $HOME/src/$CLUSTER ls -d $HOME/lib/$CLUSTER ls $HOME/lib/$CLUSTER</source> note: for newer processors this may fail. If it is a NEHALEM processor try: make clean make TARGET=NEHALEM == Paging and the OOM-Killer == When doing exhaustion of available memory experiments, problems can occur with over-commit. See [[HCL_cluster#Paging_and_the_OOM-Killer]] for more detail. == Example of experiment setup across several sites == Sources of all files mentioned below is available at: [[Grid5000:sources]]. Pick one head node as the main head node (I use grenoble, but any will do). Setup sources cd dave/fupermod-1.1.0 make clean ./configure --with-cblas=goto --prefix=/usr/local/ Reserve 2 nodes from all clusters on a 3 cluster site: oarsub -r "2011-07-25 11:01:01" -t deploy -l cluster=3/nodes=2,walltime=11:59:00 Automate with: for a in 2 3 4; do for i in `cat sites.$a`; do echo $a $i; ssh $i oarsub -r "2011-07-25 11:01:01" -t deploy -l cluster=$a/nodes=2,walltime=11:59:00; done; done Then on each site, where xxx is site name: kadeploy3 -a $HOME/grid5000/lenny-dave.env -f $OAR_NODE_FILE --output-ok-nodes deployed.xxx Gather deployed files to a head node: for i in `cat ~/sites `; do echo $i; scp $i:deployed* .&nbsp;; done cat deployed.* &gt; deployed.all Copy cluster specific libs to each deployed node /usr/local/lib dir with script copy_local_libs.sh deployed.all Copy source files to root dir of each deployed node. 
Then make install each (node ssh -f does this in parallel) for i in `cat ~/deployed.all`; do echo $i; rsync -aP ~/dave/fupermod-1.1.0 root@$i:&nbsp;; done for i in `cat ~/deployed.all`; do echo $i; ssh -f root@$i "cd fupermod-1.1.0&nbsp;; make all install"&nbsp;; done ssh to the first node ssh `head -n1 deployed.all` n=$(cat deployed.all |wc -l) mpdboot --totalnum=$n --file=$HOME/deployed.all mpdtrace cd dave/data/ mpirun -n $n /usr/local/bin/partitioner -l /usr/local/lib/libmxm_col.so -a0 -D10000 -o N=100 Cleanup after: for i in `cat ~/sites `; do echo $i; ssh $i rm deployed.* ; done == Check network speed == apt-get install iperf == Choose which network interface to use == mpirun --mca btl self,openib ... or mpirun --mca btl self,tcp ... 0c03eeec7ca2434b19d130ff7fe37eca8f76da1d 679 661 2012-02-16T23:17:52Z Davepc 2 wikitext text/x-wiki https://www.grid5000.fr/mediawiki/index.php/Grid5000:Home [https://www.grid5000.fr/mediawiki/index.php/Grid5000:UserCharter USAGE POLICY]&nbsp; - very important: After booking nodes (oarsub ...) run the command <source lang="">outofchart</source>. This will check that you haven't booked too many resources and therefore get in trouble with grid5000 ad == Login, job submission, deployment of image == *Select sites and clusters for experiments, using information on the [https://www.grid5000.fr/mediawiki/index.php/Grid5000:Network#Grid.275000_Sites Grid5000 network] and the [https://www.grid5000.fr/mediawiki/index.php/Status Status page] *Access is provided via access nodes '''access.SITE.grid5000.fr''' marked [https://www.grid5000.fr/mediawiki/index.php/External_access here] as ''accessible from '''everywhere''' via ssh with '''keyboard-interactive''' authentication method''. 
As soon as you are on one of the sites, you can directly ssh frontend node of any other site: <source lang="bash"> access_$ ssh frontend.SITE2 </source> *There is no access to Internet from computing nodes (external IPs should be registered on proxy), therefore, download/update your stuff at the access nodes. Several revision control clients are available. *Each site has a separate NFS, therefore, to run an application on several sites at once, you need to copy it '''scp, sftp, rsync''' between access or frontend nodes. *Jobs are run from the frondend nodes, using a [http://en.wikipedia.org/wiki/OpenPBS PBS]-like system [https://www.grid5000.fr/mediawiki/index.php/Cluster_experiment-OAR2 OAR]. Basic commands: **'''oarstat''' - queue status **'''oarsub''' - job submission **'''oardel''' - job removal Interactive job on deployed images: <source lang="bash"> fontend_$ oarsub -I -t deploy -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> Batch job on installed images: <source lang="bash"> fontend_$ oarsub BATCH_FILE -t allow_classic_ssh -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> *The image to deploy can be created and loaded with help of a [http://wiki.systemimager.org/index.php/Main_Page Systemimager]-like system [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2 Kadeploy]. Creating: [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2#Tune_an_environment_to_build_another_one:_customize_authentification_parameters described here] Loading: <source lang="bash"> fontend_$ kadeploy3 -a PATH_TO_PRIVATE_IMAGE_DESC -f $OAR_FILE_NODES </source> A Linux distribution lenny-x64-nfs-2.1 with mc, subversion, autotools, doxygen, MPICH2, GSL, Boost, R, gnuplot, graphviz, X11, evince is available at Orsay /home/nancy/alastovetsky/grid5000. == Compiling and running MPI applications == *Compilation should be done on one of the reserved nodes (e.g. 
ssh `head -n 1 $OAR_NODEFILE`) *Running MPI applications is described [https://www.grid5000.fr/mediawiki/index.php/Run_MPI_On_Grid%275000 here] **mpirun/mpiexec should be run from one of the reserved nodes (e.g. ssh `head -n 1 $OAR_NODEFILE`) == Setting up new deploy image == List available images kaenv3 -l Then book node and launch: oarsub -I -t deploy -l nodes=1,walltime=12 kadeploy3 -e squeeze-x64-big -f $OAR_FILE_NODES -k ssh root@`head -n 1 $OAR_NODEFILE` default password: grid5000 edit /etc/apt/sources.list apt-get update apt-get upgrade apt-get install libtool autoconf automake mc colorgcc ctags libboost-serialization-dev libboost-graph-dev libatlas-base-dev gfortran vim gdb valgrind screen subversion iperf bc gsl-bin libgsl0-dev Possibly also install (for using extrae): libxml2-dev binutils-dev libunwind7-dev <br> Compiled for sources by us: *<strike>gsl-1.14 (download: ftp://ftp.gnu.org/gnu/gsl/)&nbsp;</strike> ''Now with squeeze it is in repository.'' <strike>./configure &amp;&amp; make &amp;&amp; make install</strike> *mpich2 (download: http://www.mcs.anl.gov/research/projects/mpich2/downloads/index.php?s=downloads) ./configure --enable-shared --enable-sharedlibs=gcc --with-pm=mpd make &amp;&amp; make install Mpich2 installed to: Installing MPE2 include files to /usr/local/include Installing MPE2 libraries to /usr/local/lib Installing MPE2 utility programs to /usr/local/bin Installing MPE2 configuration files to /usr/local/etc Installing MPE2 system utility programs to /usr/local/sbin Installing MPE2 man to /usr/local/share/man Installing MPE2 html to /usr/local/share/doc/ Installed MPE2 in /usr/local *hwloc (and lstopo) (download: http://www.open-mpi.org/software/hwloc/v1.2/) compile from sources. To get xml support install libxml2-dev and pkg-config apt-get install libxml2-dev pkg-config tar -xzvf hwloc-1.1.1.tar.gz cd hwloc-1.1.1 ./configure &amp;&amp; make &amp;&amp; make install Change root password. rm sources from root dir. 
Edit the "message of the day" vi /etc/motd.tail echo 90 &gt; /proc/sys/vm/overcommit_ratio echo 2 &gt; /proc/sys/vm/overcommit_memory date &gt;&gt; release Cleanup apt-get clean rm /etc/udev/rules.d/*-persistent-net.rules Make image ssh root@'''node''' tgz-g5k &gt; $HOME/grid5000/'''imagename'''.tgz make appropriate .env file. kaenv3 -p lenny-x64-nfs -u deploy &gt; lenny-x64-custom-2.3.env <br> == GotoBLAS2 == When compiling gotoblas on a node without direct internet access get this error: <source lang="">wget http://www.netlib.org/lapack/lapack-3.1.1.tgz --2011-05-19 03:11:03-- http://www.netlib.org/lapack/lapack-3.1.1.tgz Resolving www.netlib.org... 160.36.58.108 Connecting to www.netlib.org|160.36.58.108|:80... failed: Connection timed out. Retrying. --2011-05-19 03:14:13-- (try: 2) http://www.netlib.org/lapack/lapack-3.1.1.tgz Connecting to www.netlib.org|160.36.58.108|:80... failed: Connection timed out. Retrying. ...</source> Fix by downloading http://www.netlib.org/lapack/lapack-3.1.1.tgz to the GotoBLAS2 source directory and editing this line in the Makefile 184c184 &lt; -wget http://www.netlib.org/lapack/lapack-3.1.1.tgz --- &gt; # -wget http://www.netlib.org/lapack/lapack-3.1.1.tgz <br> GotoBLAS needs to be compiled individualy for each unique machine - ie each cluster. Add the following to .bashrc export CLUSTER=`hostname |sed 's/\([a-z]*\).*/\1/'` LD_LIBRARY_PATH=$HOME/lib/$CLUSTER:$HOME/lib:/usr/local/lib:$LD_LIBRARY_PATH export LIBRARY_PATH=$HOME/lib/$CLUSTER:$HOME/lib:/usr/local/lib:$LIBRARY_PATH Run the following script once on each cluster: <source lang="bash">#! /bin/bash echo "Compiling gotoblas for cluster: $CLUSTER" cd $HOME/src if [ ! -d "$CLUSTER" ]; then mkdir $CLUSTER fi cd $CLUSTER tar -xzf ../Goto*.tar.gz cd Goto* make &> m.log if [ ! 
-d "$HOME/lib/$CLUSTER" ]; then mkdir $HOME/lib/$CLUSTER fi cp libgoto2.so $HOME/lib/$CLUSTER echo results ls -d $HOME/src/$CLUSTER ls $HOME/src/$CLUSTER ls -d $HOME/lib/$CLUSTER ls $HOME/lib/$CLUSTER</source> note: for newer processors this may fail. If it is a NEHALEM processor try: make clean make TARGET=NEHALEM == Paging and the OOM-Killer == When doing exhaustion of available memory experiments, problems can occur with over-commit. See [[HCL cluster#Paging_and_the_OOM-Killer]] for more detail. == Example of experiment setup across several sites == Sources of all files mentioned below is available at: [[Grid5000:sources]]. Pick one head node as the main head node (I use grenoble, but any will do). Setup sources cd dave/fupermod-1.1.0 make clean ./configure --with-cblas=goto --prefix=/usr/local/ Reserve 2 nodes from all clusters on a 3 cluster site: oarsub -r "2011-07-25 11:01:01" -t deploy -l cluster=3/nodes=2,walltime=11:59:00 Automate with: for a in 2 3 4; do for i in `cat sites.$a`; do echo $a $i; ssh $i oarsub -r "2011-07-25 11:01:01" -t deploy -l cluster=$a/nodes=2,walltime=11:59:00; done; done Then on each site, where xxx is site name: kadeploy3 -a $HOME/grid5000/lenny-dave.env -f $OAR_NODE_FILE --output-ok-nodes deployed.xxx Gather deployed files to a head node: for i in `cat ~/sites `; do echo $i; scp $i:deployed* .&nbsp;; done cat deployed.* &gt; deployed.all Copy cluster specific libs to each deployed node /usr/local/lib dir with script copy_local_libs.sh deployed.all Copy source files to root dir of each deployed node. 
Then make install each (node ssh -f does this in parallel) for i in `cat ~/deployed.all`; do echo $i; rsync -aP ~/dave/fupermod-1.1.0 root@$i:&nbsp;; done for i in `cat ~/deployed.all`; do echo $i; ssh -f root@$i "cd fupermod-1.1.0&nbsp;; make all install"&nbsp;; done ssh to the first node ssh `head -n1 deployed.all` n=$(cat deployed.all |wc -l) mpdboot --totalnum=$n --file=$HOME/deployed.all mpdtrace cd dave/data/ mpirun -n $n /usr/local/bin/partitioner -l /usr/local/lib/libmxm_col.so -a0 -D10000 -o N=100 Cleanup after: for i in `cat ~/sites `; do echo $i; ssh $i rm deployed.*&nbsp;; done == Check network speed == apt-get install iperf == Choose which network interface to use == mpirun --mca btl self,openib ... or mpirun --mca btl self,tcp ... 194bba5fac848c07a942f96f2d87a2ef476deb78 680 679 2012-02-16T23:18:25Z Davepc 2 wikitext text/x-wiki https://www.grid5000.fr/mediawiki/index.php/Grid5000:Home [https://www.grid5000.fr/mediawiki/index.php/Grid5000:UserCharter USAGE POLICY]&nbsp; - Very important, after booking nodes (oarsub ...) run the command:&nbsp;<source lang="">outofchart</source>&nbsp;This will check that you haven't booked too many resources and therefore get in trouble with grid5000 ad == Login, job submission, deployment of image == *Select sites and clusters for experiments, using information on the [https://www.grid5000.fr/mediawiki/index.php/Grid5000:Network#Grid.275000_Sites Grid5000 network] and the [https://www.grid5000.fr/mediawiki/index.php/Status Status page] *Access is provided via access nodes '''access.SITE.grid5000.fr''' marked [https://www.grid5000.fr/mediawiki/index.php/External_access here] as ''accessible from '''everywhere''' via ssh with '''keyboard-interactive''' authentication method''. 
As soon as you are on one of the sites, you can ssh directly to the frontend node of any other site: <source lang="bash"> access_$ ssh frontend.SITE2 </source> *There is no Internet access from the computing nodes (external IPs must be registered on the proxy), so download/update your stuff on the access nodes. Several revision control clients are available. *Each site has a separate NFS, so to run an application on several sites at once you need to copy it ('''scp, sftp, rsync''') between access or frontend nodes. *Jobs are run from the frontend nodes, using a [http://en.wikipedia.org/wiki/OpenPBS PBS]-like system, [https://www.grid5000.fr/mediawiki/index.php/Cluster_experiment-OAR2 OAR]. Basic commands: **'''oarstat''' - queue status **'''oarsub''' - job submission **'''oardel''' - job removal Interactive job on deployed images: <source lang="bash"> frontend_$ oarsub -I -t deploy -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> Batch job on installed images: <source lang="bash"> frontend_$ oarsub BATCH_FILE -t allow_classic_ssh -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> *The image to deploy can be created and loaded with the help of a [http://wiki.systemimager.org/index.php/Main_Page Systemimager]-like system, [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2 Kadeploy]. Creating: [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2#Tune_an_environment_to_build_another_one:_customize_authentification_parameters described here] Loading: <source lang="bash"> frontend_$ kadeploy3 -a PATH_TO_PRIVATE_IMAGE_DESC -f $OAR_FILE_NODES </source> A Linux distribution lenny-x64-nfs-2.1 with mc, subversion, autotools, doxygen, MPICH2, GSL, Boost, R, gnuplot, graphviz, X11 and evince is available at Orsay in /home/nancy/alastovetsky/grid5000. == Compiling and running MPI applications == *Compilation should be done on one of the reserved nodes (e.g.
ssh `head -n 1 $OAR_NODEFILE`) *Running MPI applications is described [https://www.grid5000.fr/mediawiki/index.php/Run_MPI_On_Grid%275000 here] **mpirun/mpiexec should be run from one of the reserved nodes (e.g. ssh `head -n 1 $OAR_NODEFILE`) == Setting up new deploy image == List the available images: kaenv3 -l Then book a node and launch: oarsub -I -t deploy -l nodes=1,walltime=12 kadeploy3 -e squeeze-x64-big -f $OAR_FILE_NODES -k ssh root@`head -n 1 $OAR_NODEFILE` Default password: grid5000 Edit /etc/apt/sources.list, then: apt-get update apt-get upgrade apt-get install libtool autoconf automake mc colorgcc ctags libboost-serialization-dev libboost-graph-dev libatlas-base-dev gfortran vim gdb valgrind screen subversion iperf bc gsl-bin libgsl0-dev Possibly also install (for using extrae): libxml2-dev binutils-dev libunwind7-dev <br> Compiled from sources by us: *<strike>gsl-1.14 (download: ftp://ftp.gnu.org/gnu/gsl/)&nbsp;</strike> ''Now with squeeze it is in the repository.'' <strike>./configure &amp;&amp; make &amp;&amp; make install</strike> *mpich2 (download: http://www.mcs.anl.gov/research/projects/mpich2/downloads/index.php?s=downloads) ./configure --enable-shared --enable-sharedlibs=gcc --with-pm=mpd make &amp;&amp; make install MPICH2/MPE2 install locations: Installing MPE2 include files to /usr/local/include Installing MPE2 libraries to /usr/local/lib Installing MPE2 utility programs to /usr/local/bin Installing MPE2 configuration files to /usr/local/etc Installing MPE2 system utility programs to /usr/local/sbin Installing MPE2 man to /usr/local/share/man Installing MPE2 html to /usr/local/share/doc/ Installed MPE2 in /usr/local *hwloc (and lstopo) (download: http://www.open-mpi.org/software/hwloc/v1.2/) compile from sources. For XML support, install libxml2-dev and pkg-config: apt-get install libxml2-dev pkg-config tar -xzvf hwloc-1.1.1.tar.gz cd hwloc-1.1.1 ./configure &amp;&amp; make &amp;&amp; make install Change the root password and remove the sources from the root directory.
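Several steps above ssh to `head -n 1 $OAR_NODEFILE`. OAR writes one line per reserved core to that file, so each host appears several times, and head -n 1 therefore picks the first reserved node. A minimal sketch with a simulated nodefile (the hostnames here are made up for illustration):

<source lang="bash">
# Simulated $OAR_NODEFILE: one line per reserved core, so hosts repeat.
cat > /tmp/nodefile <<EOF
genepi-1.grenoble.grid5000.fr
genepi-1.grenoble.grid5000.fr
genepi-2.grenoble.grid5000.fr
genepi-2.grenoble.grid5000.fr
EOF
head -n 1 /tmp/nodefile    # first reserved node: the one to ssh into
sort -u /tmp/nodefile      # distinct nodes, e.g. for an MPI machinefile
</source>

The same pattern (head for one node, sort -u for the distinct list) applies to the deployed.all files used later.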
Edit the "message of the day": vi /etc/motd.tail echo 90 &gt; /proc/sys/vm/overcommit_ratio echo 2 &gt; /proc/sys/vm/overcommit_memory date &gt;&gt; release Cleanup: apt-get clean rm /etc/udev/rules.d/*-persistent-net.rules Make the image: ssh root@'''node''' tgz-g5k &gt; $HOME/grid5000/'''imagename'''.tgz Make an appropriate .env file: kaenv3 -p lenny-x64-nfs -u deploy &gt; lenny-x64-custom-2.3.env <br> == GotoBLAS2 == When compiling GotoBLAS2 on a node without direct Internet access, you get this error: <source lang="">wget http://www.netlib.org/lapack/lapack-3.1.1.tgz --2011-05-19 03:11:03-- http://www.netlib.org/lapack/lapack-3.1.1.tgz Resolving www.netlib.org... 160.36.58.108 Connecting to www.netlib.org|160.36.58.108|:80... failed: Connection timed out. Retrying. --2011-05-19 03:14:13-- (try: 2) http://www.netlib.org/lapack/lapack-3.1.1.tgz Connecting to www.netlib.org|160.36.58.108|:80... failed: Connection timed out. Retrying. ...</source> Fix it by downloading http://www.netlib.org/lapack/lapack-3.1.1.tgz to the GotoBLAS2 source directory and commenting out this line in the Makefile: 184c184 &lt; -wget http://www.netlib.org/lapack/lapack-3.1.1.tgz --- &gt; # -wget http://www.netlib.org/lapack/lapack-3.1.1.tgz <br> GotoBLAS2 needs to be compiled individually for each unique machine, i.e. each cluster. Add the following to .bashrc: export CLUSTER=`hostname |sed 's/\([a-z]*\).*/\1/'` export LD_LIBRARY_PATH=$HOME/lib/$CLUSTER:$HOME/lib:/usr/local/lib:$LD_LIBRARY_PATH export LIBRARY_PATH=$HOME/lib/$CLUSTER:$HOME/lib:/usr/local/lib:$LIBRARY_PATH Run the following script once on each cluster: <source lang="bash">#!/bin/bash echo "Compiling gotoblas for cluster: $CLUSTER" cd $HOME/src if [ ! -d "$CLUSTER" ]; then mkdir $CLUSTER fi cd $CLUSTER tar -xzf ../Goto*.tar.gz cd Goto* make &> m.log if [ !
-d "$HOME/lib/$CLUSTER" ]; then mkdir $HOME/lib/$CLUSTER fi cp libgoto2.so $HOME/lib/$CLUSTER echo results ls -d $HOME/src/$CLUSTER ls $HOME/src/$CLUSTER ls -d $HOME/lib/$CLUSTER ls $HOME/lib/$CLUSTER</source> Note: for newer processors this may fail. If it is a Nehalem processor, try: make clean make TARGET=NEHALEM == Paging and the OOM-Killer == When running experiments that exhaust the available memory, problems can occur with overcommit. See [[HCL cluster#Paging_and_the_OOM-Killer]] for more detail. == Example of experiment setup across several sites == Sources of all files mentioned below are available at: [[Grid5000:sources]]. Pick one head node as the main head node (I use grenoble, but any will do). Set up the sources: cd dave/fupermod-1.1.0 make clean ./configure --with-cblas=goto --prefix=/usr/local/ Reserve 2 nodes from each cluster on a 3-cluster site: oarsub -r "2011-07-25 11:01:01" -t deploy -l cluster=3/nodes=2,walltime=11:59:00 Automate with: for a in 2 3 4; do for i in `cat sites.$a`; do echo $a $i; ssh $i oarsub -r "2011-07-25 11:01:01" -t deploy -l cluster=$a/nodes=2,walltime=11:59:00; done; done Then on each site, where xxx is the site name: kadeploy3 -a $HOME/grid5000/lenny-dave.env -f $OAR_NODE_FILE --output-ok-nodes deployed.xxx Gather the deployed files to a head node: for i in `cat ~/sites `; do echo $i; scp $i:deployed* .&nbsp;; done cat deployed.* &gt; deployed.all Copy cluster-specific libs to the /usr/local/lib dir of each deployed node with the script: copy_local_libs.sh deployed.all Copy the source files to the root dir of each deployed node.
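The cluster-specific library directories rely on deriving $CLUSTER from a node's hostname with the sed expression from the GotoBLAS2 section above; a quick self-contained check (the hostname is an invented example in the Grid5000 naming style):

<source lang="bash">
#!/bin/bash
# Grid5000 hostnames start with the cluster name, followed by a node number
# and the site domain, so keeping only the leading letters gives the cluster.
host="griffon-42.nancy.grid5000.fr"   # invented example; normally `hostname`
CLUSTER=$(echo "$host" | sed 's/\([a-z]*\).*/\1/')
echo "$CLUSTER"   # griffon
</source>

The same value then selects $HOME/lib/$CLUSTER in LD_LIBRARY_PATH, so each node picks up the GotoBLAS2 build made for its own cluster.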
Then make install on each node (ssh -f does this in parallel): for i in `cat ~/deployed.all`; do echo $i; rsync -aP ~/dave/fupermod-1.1.0 root@$i:&nbsp;; done for i in `cat ~/deployed.all`; do echo $i; ssh -f root@$i "cd fupermod-1.1.0&nbsp;; make all install"&nbsp;; done ssh to the first node: ssh `head -n1 deployed.all` n=$(cat deployed.all |wc -l) mpdboot --totalnum=$n --file=$HOME/deployed.all mpdtrace cd dave/data/ mpirun -n $n /usr/local/bin/partitioner -l /usr/local/lib/libmxm_col.so -a0 -D10000 -o N=100 Clean up afterwards: for i in `cat ~/sites `; do echo $i; ssh $i rm deployed.*&nbsp;; done == Check network speed == apt-get install iperf == Choose which network interface to use == mpirun --mca btl self,openib ... or mpirun --mca btl self,tcp ... b494ca4cda75f45dc1b62ab87408375a6f617d90 681 680 2012-02-16T23:18:49Z Davepc 2 wikitext text/x-wiki https://www.grid5000.fr/mediawiki/index.php/Grid5000:Home [https://www.grid5000.fr/mediawiki/index.php/Grid5000:UserCharter USAGE POLICY]&nbsp; - Very important: after booking nodes (oarsub ...), run the command:&nbsp;<source lang="">outofchart</source>&nbsp;This checks that you haven't booked too many resources, so you won't get in trouble with the Grid5000 admins. == Login, job submission, deployment of image == *Select sites and clusters for experiments, using information on the [https://www.grid5000.fr/mediawiki/index.php/Grid5000:Network#Grid.275000_Sites Grid5000 network] and the [https://www.grid5000.fr/mediawiki/index.php/Status Status page] *Access is provided via the access nodes '''access.SITE.grid5000.fr''' marked [https://www.grid5000.fr/mediawiki/index.php/External_access here] as ''accessible from '''everywhere''' via ssh with '''keyboard-interactive''' authentication method''.
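Hopping through the access node on every connection can be automated with a local SSH configuration; a possible ~/.ssh/config fragment on your own machine (SITE and USERNAME are placeholders, the `.g5k` suffix is our own convention, and nc must be available on the access node — check the external-access page linked above for the current recommendation):

<source lang="bash">
# ~/.ssh/config on your local machine
Host g5k
    HostName access.SITE.grid5000.fr
    User USERNAME

# "ssh frontend.SITE2.g5k" tunnels through the access node:
# %h is the full host alias, and basename strips the .g5k suffix.
Host *.g5k
    User USERNAME
    ProxyCommand ssh g5k "nc -q0 $(basename %h .g5k) %p"
</source>

With this in place, scp and rsync to any frontend also work transparently from outside Grid5000.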
22034a77aae780acf751023a0cf49dde961b1efe Main Page 0 1 644 637 2011-12-13T11:23:44Z Kiril 3 /* Libraries */ wikitext text/x-wiki This site is set up for sharing ideas, findings and experience in heterogeneous computing. Please, log in and create new or edit existing pages. How to format wiki-pages read [[Help:Editing|here]].
dabd74770ea31730c853fc7227cdfbce5222ff00 682 673 2012-03-13T16:22:02Z Root 1 /* Paper & Presentation Tools */ wikitext text/x-wiki This site is set up for sharing ideas, findings and experience in heterogeneous computing. Please, log in and create new or edit existing pages. How to format wiki-pages read [[Help:Editing|here]].
== HCL software for heterogeneous computing == * Extensions for [[MPI]]: [http://hcl.ucd.ie/project/mpC mpC] [http://hcl.ucd.ie/project/HeteroMPI HeteroMPI] [http://hcl.ucd.ie/project/libELC libELC] * Extensions for [[GridRPC]]: [http://hcl.ucd.ie/project/SmartGridSolve SmartGridSolve] [http://hcl.ucd.ie/project/NI-Connect NI-Connect] * Computation benchmarking, modeling, dynamic load balancing: [http://hcl.ucd.ie/project/fupermod FuPerMod] [http://hcl.ucd.ie/project/pmm PMM] * Communication benchmarking, modeling, optimization: [http://hcl.ucd.ie/project/cpm CPM] [http://hcl.ucd.ie/project/mpiblib MPIBlib] == Heterogeneous mathematical software == * [http://hcl.ucd.ie/project/HeteroScaLAPACK HeteroScaLAPACK] * [http://hcl.ucd.ie/project/Hydropad Hydropad] == Operating systems == * [[Linux]] * [[Windows]] == Development tools == * [[C/C++]], [[Python]], [[UML]], [[FORTRAN]] * [[Autotools]] * [[GDB]], [[OProfile]], [[Valgrind]] * [[Doxygen]] * [[ChangeLog]], [[Subversion]] * [[Eclipse]] == [[Libraries]] == * [[GNU C Library]] * [[MPI]] * [[STL]], [[Boost]] * [[GSL]] * [[BLAS LAPACK ScaLAPACK]] * [[NLOPT]] * [[BitTorrent (B. 
Cohen's version)]] * [[CUDA SDK]] == Data processing == * [[gnuplot]], [[pgfplot]], [[matplotlib]] * [[Graphviz]] * [[Octave]], [[R]] * [[G3DViewer]] == Paper & Presentation Tools == * [[Dia]], [[PGF/Tikz]], [[pgfplot]] * [[LaTeX]], [[Beamer]] * [[BibTeX]], [[JabRef]] == Hardware == * [[HCL cluster]] * [[Other UCD Resources]] * [[UTK multicores + GPU]] * [[Grid5000]] [[SSH|How to connect to cluster via SSH]] [[hwloc|How to find information about the hardware]] == Mathematics == * [http://en.wikipedia.org/wiki/Confidence_interval Confidence interval (Statistics)], [http://en.wikipedia.org/wiki/Student's_t-distribution Student's t-distribution] (implemented in [[GSL]]) * [http://en.wikipedia.org/wiki/Linear_regression Linear regression] (implemented in [[GSL]]) * [http://en.wikipedia.org/wiki/Binomial_tree#Binomial_tree Binomial tree] (use [[Graphviz]] to visualize trees) * [http://en.wikipedia.org/wiki/Spline_interpolation Spline interpolation], [http://en.wikipedia.org/wiki/B-spline Spline approximation] (implemented in [[GSL]]) 2aa9c65b12112bc55a6aa0c42ef4a52a96aae27a BitTorrent (B. Cohen's version) 0 79 645 2011-12-13T11:42:45Z Kiril 3 Created page with "The modified bittorrent tarball is currently under the MPIBlib repository: svn co https://hcl.ucd.ie/repos/CPM/trunk/MPIBlib/tests/bittorrent/bittorrent.tar.gz * Create a file…" wikitext text/x-wiki The modified bittorrent tarball is currently under the MPIBlib repository: svn co https://hcl.ucd.ie/repos/CPM/trunk/MPIBlib/tests/bittorrent/bittorrent.tar.gz * Create a file of any size consisting of "s" characters only. * Create torrent file: btmakemetafile myfile.ext http://<frontend on G5K>:6969/announce * Stick the torrent file into $HOME/public at the frontend. 
** It is then available from within G5K under http://public.lille.grid5000.fr/~kdichev/{torrent file} * Start the tracker on the frontend: bttrack --port 6969 --dfile dstate * Start one client that has the full copy: btdownloadheadless.py --url http://my.server/myfile.torrent --saveas myfile.ext 742b2aa3a2a8b02c01a1c0a8003596cb5519f3b2 646 645 2011-12-13T11:44:56Z Kiril 3 wikitext text/x-wiki The modified bittorrent tarball is currently under the MPIBlib repository: svn co https://hcl.ucd.ie/repos/CPM/trunk/MPIBlib/tests/bittorrent/bittorrent.tar.gz * Create a file of any size consisting of "s" characters only. * Create torrent file: btmakemetafile myfile.ext http://<frontend on G5K>:6969/announce * Stick the torrent file into $HOME/public at the frontend. ** It is then available from within G5K under http://public.lille.grid5000.fr/~kdichev/{torrent file} * Start the tracker on the frontend: bttrack --port 6969 --dfile dstate * Book nodes interactively: qsub -I -lnodes=... * Start one client that has the full copy: btdownloadheadless.py --url http://my.server/myfile.torrent --saveas myfile.ext * On all other nodes, launch the client at the same time: 91a89aff59979158dbeaa3eaf91e0171e119fd51 647 646 2011-12-13T11:49:45Z Kiril 3 wikitext text/x-wiki The modified bittorrent tarball is currently under the MPIBlib repository: svn co https://hcl.ucd.ie/repos/CPM/trunk/MPIBlib/tests/bittorrent/bittorrent.tar.gz * Create a file of any size consisting of "s" characters only. * Create torrent file: btmakemetafile myfile.ext http://<frontend on G5K>:6969/announce * Stick the torrent file into $HOME/public at the frontend. ** It is then available from within G5K under http://public.lille.grid5000.fr/~kdichev/{torrent file} * Start the tracker on the frontend: bttrack --port 6969 --dfile dstate * Book nodes interactively: qsub -I -lnodes=... 
* Start one client that has the full copy, on all other nodes, launch the client at the same time: script-per-process.sh #!/bin/bash cd /tmp ; btdownloadheadless --display_interval 100 --url http://public.lille.grid5000.fr/~kdichev/{torrent file} run.sh ssh <one-node> btdownloadheadless --url http://public.lille.grid5000.fr/~kdichev/{torrent file} & mpirun -n 14 --machinefile hostfile $PWD/script-per-process.sh da29d30874edca827c56dc0e96963c9fe830fde8 648 647 2011-12-13T11:50:36Z Kiril 3 wikitext text/x-wiki The modified bittorrent tarball is currently under the MPIBlib repository: svn co https://hcl.ucd.ie/repos/CPM/trunk/MPIBlib/tests/bittorrent/bittorrent.tar.gz * Create a file of any size consisting of "s" characters only. * Create torrent file: btmakemetafile myfile.ext http://<frontend on G5K>:6969/announce * Stick the torrent file into $HOME/public at the frontend. ** It is then available from within G5K under http://public.lille.grid5000.fr/~kdichev/{torrent file} * Start the tracker on the frontend: bttrack --port 6969 --dfile dstate * Book nodes interactively: qsub -I -lnodes=... 
* Start one client that has the full copy ssh <one-node> btdownloadheadless --url http://public.lille.grid5000.fr/~kdichev/{torrent file} & * On all other nodes, launch the client at the same time: script-per-process.sh #!/bin/bash cd /tmp ; btdownloadheadless --display_interval 100 --url http://public.lille.grid5000.fr/~kdichev/{torrent file} run.sh mpirun -n 14 --machinefile hostfile $PWD/script-per-process.sh 2716857542dc9ee4842fda3fce94b6482ac49da6 649 648 2011-12-13T12:05:45Z Kiril 3 wikitext text/x-wiki The modified bittorrent tarball is currently under the MPIBlib repository: svn co https://hcl.ucd.ie/repos/CPM/trunk/MPIBlib/tests/bittorrent/bittorrent.tar.gz * Extract this under your home directory * export following variables to tweak the way Python paths are searched: export PYTHONHOME=$HOME export PYTHONPATH=/usr/lib * Create a file of any size consisting of "s" characters only. * Create torrent file: btmakemetafile myfile.ext http://<frontend on G5K>:6969/announce * Stick the torrent file into $HOME/public at the frontend. ** It is then available from within G5K under http://public.lille.grid5000.fr/~kdichev/{torrent file} * Start the tracker on the frontend: bttrack --port 6969 --dfile dstate * Book nodes interactively: qsub -I -lnodes=... 
* Start one client that has the full copy ssh <one-node> btdownloadheadless --url http://public.lille.grid5000.fr/~kdichev/{torrent file} & * On all other nodes, launch the client at the same time: script-per-process.sh #!/bin/bash cd /tmp ; btdownloadheadless --display_interval 100 --url http://public.lille.grid5000.fr/~kdichev/{torrent file} run.sh mpirun -n 14 --machinefile hostfile $PWD/script-per-process.sh 6ce21c253b880b6c92409dffe80c6da6eebf2a2a 650 649 2011-12-13T12:06:35Z Kiril 3 wikitext text/x-wiki The modified bittorrent tarball is currently under the MPIBlib repository: svn co https://hcl.ucd.ie/repos/CPM/trunk/MPIBlib/tests/bittorrent/bittorrent.tar.gz * Extract this under your home directory ** You might need mv $HOME/lib/python $HOME/lib/python2.6 * export following variables to tweak the way Python paths are searched: export PYTHONHOME=$HOME export PYTHONPATH=/usr/lib/python2.6/ * Create a file of any size consisting of "s" characters only. * Create torrent file: btmakemetafile myfile.ext http://<frontend on G5K>:6969/announce * Stick the torrent file into $HOME/public at the frontend. ** It is then available from within G5K under http://public.lille.grid5000.fr/~kdichev/{torrent file} * Start the tracker on the frontend: bttrack --port 6969 --dfile dstate * Book nodes interactively: qsub -I -lnodes=... 
* Start one client that has the full copy ssh <one-node> btdownloadheadless --url http://public.lille.grid5000.fr/~kdichev/{torrent file} & * On all other nodes, launch the client at the same time: script-per-process.sh #!/bin/bash cd /tmp ; btdownloadheadless --display_interval 100 --url http://public.lille.grid5000.fr/~kdichev/{torrent file} run.sh mpirun -n 14 --machinefile hostfile $PWD/script-per-process.sh cdaa0f6ad893429ecb002342e28b9a3e59935f3b 651 650 2011-12-13T12:27:12Z Kiril 3 wikitext text/x-wiki The modified bittorrent tarball is currently under the MPIBlib repository: svn co https://hcl.ucd.ie/repos/CPM/trunk/MPIBlib/tests/bittorrent/bittorrent.tar.gz * Extract this under your home directory ** You might need mv $HOME/lib/python $HOME/lib/python2.6 * export following variables to tweak the way Python paths are searched: export PYTHONHOME=$HOME export PYTHONPATH=/usr/lib/python2.6/ * modify logfile path in /home/kdichev/lib/python2.6/BitTorrent/StorageWrapper.py * Create a file of any size consisting of "s" characters only. * Create torrent file: btmakemetafile myfile.ext http://<frontend on G5K>:6969/announce * Stick the torrent file into $HOME/public at the frontend. ** It is then available from within G5K under http://public.lille.grid5000.fr/~kdichev/{torrent file} * Start the tracker on the frontend: bttrack --port 6969 --dfile dstate * Book nodes interactively: qsub -I -lnodes=... 
* Start one client that has the full copy ssh <one-node> btdownloadheadless --url http://public.lille.grid5000.fr/~kdichev/{torrent file} & * On all other nodes, launch the client at the same time: script-per-process.sh #!/bin/bash cd /tmp ; btdownloadheadless --display_interval 100 --url http://public.lille.grid5000.fr/~kdichev/{torrent file} run.sh mpirun -n 14 --machinefile hostfile $PWD/script-per-process.sh 5c460b3040306c498c455982a9b645f14b022a3f 652 651 2011-12-13T12:27:21Z Kiril 3 wikitext text/x-wiki The modified bittorrent tarball is currently under the MPIBlib repository: svn co https://hcl.ucd.ie/repos/CPM/trunk/MPIBlib/tests/bittorrent/bittorrent.tar.gz * Extract this under your home directory ** You might need mv $HOME/lib/python $HOME/lib/python2.6 * export following variables to tweak the way Python paths are searched: export PYTHONHOME=$HOME export PYTHONPATH=/usr/lib/python2.6/ * Modify logfile path in /home/kdichev/lib/python2.6/BitTorrent/StorageWrapper.py * Create a file of any size consisting of "s" characters only. * Create torrent file: btmakemetafile myfile.ext http://<frontend on G5K>:6969/announce * Stick the torrent file into $HOME/public at the frontend. ** It is then available from within G5K under http://public.lille.grid5000.fr/~kdichev/{torrent file} * Start the tracker on the frontend: bttrack --port 6969 --dfile dstate * Book nodes interactively: qsub -I -lnodes=... 
* Start one client that has the full copy ssh <one-node> btdownloadheadless --url http://public.lille.grid5000.fr/~kdichev/{torrent file} & * On all other nodes, launch the client at the same time: script-per-process.sh #!/bin/bash cd /tmp ; btdownloadheadless --display_interval 100 --url http://public.lille.grid5000.fr/~kdichev/{torrent file} run.sh mpirun -n 14 --machinefile hostfile $PWD/script-per-process.sh fe5685a8765eccc7efecb171a9d3b0a54e6b7d7a 653 652 2011-12-13T12:27:57Z Kiril 3 wikitext text/x-wiki The modified bittorrent tarball is currently under the MPIBlib repository: svn co https://hcl.ucd.ie/repos/CPM/trunk/MPIBlib/tests/bittorrent/bittorrent.tar.gz * Extract this under your home directory ** You might need mv $HOME/lib/python $HOME/lib/python2.6 * export following variables to tweak the way Python paths are searched: export PYTHONHOME=$HOME export PYTHONPATH=/usr/lib/python2.6/ * Modify logfile path in /home/kdichev/lib/python2.6/BitTorrent/StorageWrapper.py and create according directory structure * Create a file of any size consisting of "s" characters only. * Create torrent file: btmakemetafile myfile.ext http://<frontend on G5K>:6969/announce * Stick the torrent file into $HOME/public at the frontend. ** It is then available from within G5K under http://public.lille.grid5000.fr/~kdichev/{torrent file} * Start the tracker on the frontend: bttrack --port 6969 --dfile dstate * Book nodes interactively: qsub -I -lnodes=... 
* Start one client that has the full copy ssh <one-node> btdownloadheadless --url http://public.lille.grid5000.fr/~kdichev/{torrent file} & * On all other nodes, launch the client at the same time: script-per-process.sh #!/bin/bash cd /tmp ; btdownloadheadless --display_interval 100 --url http://public.lille.grid5000.fr/~kdichev/{torrent file} run.sh mpirun -n 14 --machinefile hostfile $PWD/script-per-process.sh 83697cb79c1ff3c33221fbebd5a93c9a93921833 OpenMPI 0 47 654 550 2011-12-15T15:07:08Z Kiril 3 wikitext text/x-wiki http://www.open-mpi.org/faq/ == MCA parameter files == If you want to permanently use some MCA parameter settings, you can create a file $HOME/.openmpi/mca-params.conf, e.g.: cat $HOME/.openmpi/mca-params.conf btl_tcp_if_exclude = lo,eth1 == Running applications on Multiprocessors/Multicores == Process can be bound to specific sockets and cores on nodes by choosing right options of mpirun. * [http://www.open-mpi.org/doc/v1.4/man1/mpirun.1.php#sect9 Process binding] * [http://www.open-mpi.org/doc/v1.4/man1/mpirun.1.php#sect10 Rankfiles] == Debugging applications on Multiprocessors/Multicores == * [http://www.open-mpi.org/faq/?category=debugging#serial-debuggers Serial Debugger (Gdb)] ** 1. Attach to individual MPI processes after they are running.<br /> For example, launch your MPI application as normal with mpirun. Then login to the node(s) where your application is running and use the --pid option to gdb to attach to your application. ** 2. Use mpirun to launch xterms (or equivalent) with serial debuggers. 
shell$ mpirun -np 4 xterm -e gdb my_mpi_application * [http://www.open-mpi.org/faq/?category=debugging#parallel-debuggers Parallel Debugers] ** [http://www.open-mpi.org/faq/?category=running#run-with-tv TotalView] ** [http://www.open-mpi.org/faq/?category=running#run-with-ddt DDT] == PERUSE == caaf32c221e9cd42d7a6cc653216b672007bef69 656 654 2011-12-15T15:09:40Z Kiril 3 /* PERUSE */ wikitext text/x-wiki http://www.open-mpi.org/faq/ == MCA parameter files == If you want to permanently use some MCA parameter settings, you can create a file $HOME/.openmpi/mca-params.conf, e.g.: cat $HOME/.openmpi/mca-params.conf btl_tcp_if_exclude = lo,eth1 == Running applications on Multiprocessors/Multicores == Process can be bound to specific sockets and cores on nodes by choosing right options of mpirun. * [http://www.open-mpi.org/doc/v1.4/man1/mpirun.1.php#sect9 Process binding] * [http://www.open-mpi.org/doc/v1.4/man1/mpirun.1.php#sect10 Rankfiles] == Debugging applications on Multiprocessors/Multicores == * [http://www.open-mpi.org/faq/?category=debugging#serial-debuggers Serial Debugger (Gdb)] ** 1. Attach to individual MPI processes after they are running.<br /> For example, launch your MPI application as normal with mpirun. Then login to the node(s) where your application is running and use the --pid option to gdb to attach to your application. ** 2. Use mpirun to launch xterms (or equivalent) with serial debuggers. 
shell$ mpirun -np 4 xterm -e gdb my_mpi_application * [http://www.open-mpi.org/faq/?category=debugging#parallel-debuggers Parallel Debuggers] ** [http://www.open-mpi.org/faq/?category=running#run-with-tv TotalView] ** [http://www.open-mpi.org/faq/?category=running#run-with-ddt DDT] == PERUSE == [[Media:current_peruse_spec.pdf]] f0461cd2ee84c1c897dd01cdc39dd4607239bf80 657 656 2011-12-15T15:10:55Z Kiril 3 /* PERUSE */ wikitext text/x-wiki http://www.open-mpi.org/faq/ == MCA parameter files == If you want to permanently use some MCA parameter settings, you can create a file $HOME/.openmpi/mca-params.conf, e.g.: cat $HOME/.openmpi/mca-params.conf btl_tcp_if_exclude = lo,eth1 == Running applications on Multiprocessors/Multicores == Processes can be bound to specific sockets and cores on nodes by choosing the right mpirun options. * [http://www.open-mpi.org/doc/v1.4/man1/mpirun.1.php#sect9 Process binding] * [http://www.open-mpi.org/doc/v1.4/man1/mpirun.1.php#sect10 Rankfiles] == Debugging applications on Multiprocessors/Multicores == * [http://www.open-mpi.org/faq/?category=debugging#serial-debuggers Serial Debugger (Gdb)] ** 1. Attach to individual MPI processes after they are running.<br /> For example, launch your MPI application as normal with mpirun. Then log in to the node(s) where your application is running and use the --pid option to gdb to attach to your application. ** 2. Use mpirun to launch xterms (or equivalent) with serial debuggers.
shell$ mpirun -np 4 xterm -e gdb my_mpi_application * [http://www.open-mpi.org/faq/?category=debugging#parallel-debuggers Parallel Debuggers] ** [http://www.open-mpi.org/faq/?category=running#run-with-tv TotalView] ** [http://www.open-mpi.org/faq/?category=running#run-with-ddt DDT] == PERUSE == [[Media:current_peruse_spec.pdf|PERUSE Specification]] 7d6509445f371dcc9e7d4e86bcca0e1d2f25c827 File:Current peruse spec.pdf 6 80 655 2011-12-15T15:08:06Z Kiril 3 PERUSE specification wikitext text/x-wiki PERUSE specification 0d2c1378cc4737ac140c1b75f2e16e2456bee779 GDB 0 81 663 2012-01-30T16:54:29Z Davepc 2 Created page with "Debugging with GDB compile the programme with -g -O0 (or ./configure --enable-debug) For a serial programme gdb ./programme_name" wikitext text/x-wiki Debugging with GDB compile the programme with -g -O0 (or ./configure --enable-debug) For a serial programme gdb ./programme_name 3f5737a176a02b3538050c8471df6b0a1d0e0d25 677 663 2012-01-30T18:02:44Z Davepc 2 wikitext text/x-wiki Debugging with GDB compile the programme with -g -O0 (or ./configure --enable-debug) For a serial programme (or MPI running with 1 process) gdb ./programme_name To debug an MPI application in parallel, add this line somewhere in the code: if (!rank) getc(stdin); MPI_Barrier(MPI_COMM_WORLD); Then run the code and it will hang on that line.
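The attach-by-pid route above can be scripted. A minimal sketch, assuming bash and pgrep; the application name my_mpi_application is a placeholder, and the script only prints the gdb attach commands rather than launching gdb itself:

```shell
# Sketch: print a "gdb --pid" attach command for every running process
# of the given programme. Run it on the node where the ranks execute,
# then paste the printed command for the rank you want to inspect.
attach_cmds() {
    local app="$1"
    for pid in $(pgrep -x "$app"); do
        echo "gdb --pid $pid"
    done
}

attach_cmds my_mpi_application   # placeholder name; prints nothing if not running
```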
e05b3ff1563deced85e64b21411601943ac78445 FORTRAN 0 82 664 2012-01-30T16:55:05Z Root 1 Created page with "[http://www.yolinux.com/TUTORIALS/LinuxTutorialMixingFortranAndC.html Tutorial on mixing FORTRAN and C/C++ code]" wikitext text/x-wiki [http://www.yolinux.com/TUTORIALS/LinuxTutorialMixingFortranAndC.html Tutorial on mixing FORTRAN and C/C++ code] 4627cfca87cf697e3943dcd2593568b4ffc04d38 BibTeX 0 83 669 2012-01-30T17:23:33Z Root 1 Created page with "[http://en.wikipedia.org/wiki/BibTeX BibTeX] - reference management software" wikitext text/x-wiki [http://en.wikipedia.org/wiki/BibTeX BibTeX] - reference management software 9ed28861fde593358787899aa4d4334f4c5102ea Beamer 0 84 670 2012-01-30T17:23:53Z Root 1 Created page with "[http://en.wikipedia.org/wiki/Beamer_(LaTeX) Beamer] - a package for presentation slides" wikitext text/x-wiki [http://en.wikipedia.org/wiki/Beamer_(LaTeX) Beamer] - a package for presentation slides 4d8c59c253b6ba1fcef3549d7631741276395314 LaTeX 0 20 671 612 2012-01-30T17:24:38Z Root 1 wikitext text/x-wiki *[[BibTeX]] - reference management software *[[Beamer]] - a package for presentation slides *[[PGF/Tikz]] * Listings - a package for putting programming code within LaTeX - http://tug.ctan.org/cgi-bin/ctanPackageInformation.py?id=listings * LaTeX can be used in the [[Doxygen]] documentation in order to include formulas, biblio references, etc. == Editors == *vim with [http://vim-latex.sourceforge.net/index.php?subject=manual&title=Tutorial#tutorial vim-latex] *Kile *Emacs + plugin *[[Eclipse]] + [http://texlipse.sourceforge.net/ TeXlipse] == Windows == *[http://miktex.org/ MiKTeX] - LaTeX implementation *[http://www.texniccenter.org/ TeXnicCenter] - editor == Working with eps images == .tex with .eps images <source lang="">latex file.tex dvipdf file.dvi</source> or convert all eps images to pdf <source lang="">for i in `ls *.eps`; do echo $i; epspdf $i; done</source> then <source lang="">pdflatex file.tex</source> If you include
graphics without the extension, both latex and pdflatex will work when you keep both .eps and .pdf files in your image directory. 6d840b68a2d735368aa7d769da60175d9c6e9328 672 671 2012-01-30T17:24:52Z Root 1 wikitext text/x-wiki *[[BibTeX]] - reference management software *[[Beamer]] - a package for presentation slides *[[PGF/Tikz]] *Listings - a package for putting programming code within LaTeX - http://tug.ctan.org/cgi-bin/ctanPackageInformation.py?id=listings *LaTeX can be used in the [[Doxygen]] documentation in order to include formulas, biblio references, etc. == Editors == *vim with [http://vim-latex.sourceforge.net/index.php?subject=manual&title=Tutorial#tutorial vim-latex] *Kile *Emacs + plugin *[[Eclipse]] + [http://texlipse.sourceforge.net/ TeXlipse] == Windows == *[http://miktex.org/ MiKTeX] - LaTeX implementation *[http://www.texniccenter.org/ TeXnicCenter] - editor == Working with eps images == .tex with .eps images <source lang="">latex file.tex dvipdf file.dvi</source> or convert all eps images to pdf <source lang="">for i in `ls *.eps`; do echo $i; epspdf $i; done</source> then <source lang="">pdflatex file.tex</source> If you include graphics without the extension, both latex and pdflatex will work when you keep both .eps and .pdf files in your image directory. e8ad85ff490656f05e89cf39d26dcdd8ae212758 PGF/Tikz 0 85 674 2012-01-30T17:26:54Z Quintin 10 Created page with "=PGF/Tikz=" wikitext text/x-wiki =PGF/Tikz= 8f35d06ea85d25f3894bae00ed857881b7f087bf 676 674 2012-01-30T17:43:30Z Quintin 10 wikitext text/x-wiki =Write a figure= The preamble of the latex file must contain: <source lang="latex">\usepackage{tikz}</source> Some optional libraries could be added like this: <source lang="latex">\usetikzlibrary{calc}</source> To start a figure, the code must be inside the tikzpicture environment like this: <source lang="latex">\begin{tikzpicture} ... TikZ code here...
\end{tikzpicture}</source> =Example= <source lang="latex"> % Author: Quintin Jean-Noël % <http://moais.imag.fr/membres/jean-noel.quintin/> \documentclass{article} \usepackage{tikz} \usetikzlibrary[topaths] % A counter, since TikZ is not clever enough (yet) to handle % arbitrary angle systems. \newcount\mycount \begin{document} \begin{tikzpicture}[transform shape] %the multiplication with floats is not possible. Thus I split the loop %in two. \foreach \number in {1,...,8}{ % Compute angle: \mycount=\number \advance\mycount by -1 \multiply\mycount by 45 \advance\mycount by 0 \node[draw,circle,inner sep=0.125cm] (N-\number) at (\the\mycount:5.4cm) {}; } \foreach \number in {9,...,16}{ % Compute angle: \mycount=\number \advance\mycount by -1 \multiply\mycount by 45 \advance\mycount by 22.5 \node[draw,circle,inner sep=0.125cm] (N-\number) at (\the\mycount:5.4cm) {}; } \foreach \number in {1,...,15}{ \mycount=\number \advance\mycount by 1 \foreach \numbera in {\the\mycount,...,16}{ \path (N-\number) edge[->,bend right=3] (N-\numbera) edge[<-,bend left=3] (N-\numbera); } } \end{tikzpicture} \end{document} </source> e26ce97d98b149e10d8f2bb7a8dc21496e081a62 683 676 2012-03-13T16:25:50Z Root 1 wikitext text/x-wiki =Write a figure= The preamble of the latex file must contain: <source lang="latex">\usepackage{tikz}</source> Some optional libraries could be added like this: <source lang="latex">\usetikzlibrary{calc}</source> To start a figure, the code must be inside the tikzpicture environment like this: <source lang="latex">\begin{tikzpicture} ... TikZ code here... \end{tikzpicture}</source> =Example= <source lang="latex"> % Author: Quintin Jean-Noël % <http://moais.imag.fr/membres/jean-noel.quintin/> \documentclass{article} \usepackage{tikz} \usetikzlibrary[topaths] % A counter, since TikZ is not clever enough (yet) to handle % arbitrary angle systems. \newcount\mycount \begin{document} \begin{tikzpicture}[transform shape] %the multiplication with floats is not possible.
Thus I split the loop %in two. \foreach \number in {1,...,8}{ % Compute angle: \mycount=\number \advance\mycount by -1 \multiply\mycount by 45 \advance\mycount by 0 \node[draw,circle,inner sep=0.125cm] (N-\number) at (\the\mycount:5.4cm) {}; } \foreach \number in {9,...,16}{ % Compute angle: \mycount=\number \advance\mycount by -1 \multiply\mycount by 45 \advance\mycount by 22.5 \node[draw,circle,inner sep=0.125cm] (N-\number) at (\the\mycount:5.4cm) {}; } \foreach \number in {1,...,15}{ \mycount=\number \advance\mycount by 1 \foreach \numbera in {\the\mycount,...,16}{ \path (N-\number) edge[->,bend right=3] (N-\numbera) edge[<-,bend left=3] (N-\numbera); } } \end{tikzpicture} \end{document} </source> =See also= * [[pgfplot]] a274f65ceece827b60953a977bd04b0208a248e6 Matplotlib 0 86 675 2012-01-30T17:28:02Z Jun.zhu 11 Created page with "http://matplotlib.sourceforge.net/" wikitext text/x-wiki http://matplotlib.sourceforge.net/ d6cd62b8b2e4285f27093e154bdbf4ca84d853bc Gnuplot 0 22 678 410 2012-02-16T23:14:37Z Davepc 2 wikitext text/x-wiki [http://www.gnuplot.info/documentation.html Official gnuplot documentation] [http://gnuplot.sourceforge.net/demo/ Demo scripts for gnuplot] [http://t16web.lanl.gov/Kawano/gnuplot/index-e.html GNUPLOT: not so Frequently Asked Questions] When plotting "points" data files from fupermod, you will need [http://gnuplot.sourceforge.net/docs_4.2/node172.html this]: set datafile missing "." cb96ce3723c23406d4559c9d3be720144a6ef9dd Linux 0 3 686 411 2012-03-13T16:32:00Z Root 1 /* Utilities */ wikitext text/x-wiki == Environment == * '''.*rc''' - for non-login shells * '''.*profile''' - for login shells, uses the rc settings == Utilities == * '''mc''' (midnight commander) - a file manager with a built-in text editor. To copy text, hold the shift button.
* '''cg, vg''' (Code Grep and Vi Grepped) - tools for finding and modifying text by keyword == Tips and Tricks == * [[SSH|How to connect via SSH]] * Use <code>update-alternatives --config NAME</code> to switch between different software implementations. For example, <code>update-alternatives --config java</code> allows you to switch between Sun, OpenJDK and GNU Java e37a186e0be5daf6b2a17f849e01b119f22a9b11 Subversion 0 19 687 42 2012-03-14T15:34:07Z Davepc 2 wikitext text/x-wiki http://svnbook.red-bean.com/ * Subversion clients work with <code>.svn</code> directories - don't remove them. * Mind the version of the client (currently, 1.5, 1.6). == Repositories == * http://hcl.ucd.ie/repos/project_name - read only * https://hcl.ucd.ie/repos/project_name - authenticated user access == To submit == * Software sources: models, code, resource files * Documentation sources: texts, diagrams, data * Configuration files * Test sources: code, input data == Not to submit == * Binaries: object files, libraries, executables * Built documentation: html, pdf * Personal settings: Eclipse projects, ... * Test output = Subversion for Users = A good Linux client: RapidSVN 7b9703b41e308dfc4b4137cac2f8d4acb3ea36d3 688 687 2012-03-14T15:49:42Z Davepc 2 wikitext text/x-wiki http://svnbook.red-bean.com/ *Subversion clients work with <code>.svn</code> directories - don't remove them. *Mind the version of the client (currently, 1.5, 1.6). == Repositories == *http://hcl.ucd.ie/repos/project_name - read only *https://hcl.ucd.ie/repos/project_name - authenticated user access == To submit == *Software sources: models, code, resource files *Documentation sources: texts, diagrams, data *Configuration files *Test sources: code, input data == Not to submit == *Binaries: object files, libraries, executables *Built documentation: html, pdf *Personal settings: Eclipse projects, ... *Test output = Subversion for Users = A good Linux client: [http://www.rapidsvn.org/index.php/Documentation RapidSVN].
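The "to submit / not to submit" convention above can be enforced mechanically with Subversion's svn:ignore property. A minimal sketch, assuming a checked-out working copy; the pattern list is illustrative, and the propset command is only printed here rather than executed:

```shell
# Sketch: collect the "not to submit" patterns (binaries, built docs)
# into a file, then apply them as svn:ignore on the working-copy root.
cat > ignore-patterns.txt <<'EOF'
*.o
*.a
*.so
html
*.pdf
EOF

# Inside a checked-out working copy, this would be run for real:
echo "svn propset svn:ignore -F ignore-patterns.txt ."
```

After the property is set, <code>svn status</code> stops listing the matching files, so they are never committed by accident.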
== RapidSVN, Gforge &amp; passwords == '''Problem:''' RapidSVN doesn't directly support svn over ssh and so doesn't remember ssh passwords. And gforge.ucd.ie appares not to support passwordless authentication with publickey.&nbsp; '''Solution:''' Use sshpass to remember password. Note: this method involves having your gforge password in plain text, and so is a potential security risk - it should be different to other passwords etc. Install sshpass &gt;=1.05 (note current ubuntu usese version 1.04 which just hangs - so install from sources or Ubuntu 12.4) edit ~/.subversion/config, in&nbsp;[tunnels] section add the line:<br> &nbsp;gforge = sshpass -f{path to file holding password} ssh -o PubkeyAuthentication=no -o ControlMaster=no Then check out with: svn checkout svn+gforge://&lt;user&gt;@gforge.ucd.ie/var/lib/gforge/chroot/scmrepos/svn/fupermod/trunk fupermod (where previously it was:&nbsp;svn checkout svn+ssh) c5e734cd68a15ffeb878724041e69dfc69cb6bf8 689 688 2012-03-14T15:50:00Z Davepc 2 wikitext text/x-wiki http://svnbook.red-bean.com/ *Subversion clients work with <code>.svn</code> directories - don't remove them. *Mind the version of the client (currently, 1.5, 1.6). == Repositories == *http://hcl.ucd.ie/repos/project_name - read only *https://hcl.ucd.ie/repos/project_name - authenticated user access == To submit == *Software sources: models, code, resource files *Documentation sources: texts, diagrams, data *Configuration files *Test sourses: code, input data == Not to submit == *Binaries: object files, libraries, executables *Built documentation: html, pdf *Personal settings: Eclipse projects, ... *Test output = Subversion for Users = A good linux client: [http://www.rapidsvn.org/index.php/Documentation RapidSVN]. == RapidSVN, Gforge &amp; passwords == '''Problem:''' RapidSVN doesn't directly support svn over ssh and so doesn't remember ssh passwords. 
And gforge.ucd.ie appares not to support passwordless authentication with publickey.&nbsp; '''Solution:''' Use sshpass to remember password. Note: this method involves having your gforge password in plain text, and so is a potential security risk - it should be different to other passwords etc. Install sshpass &gt;=1.05 (note current ubuntu usese version 1.04 which just hangs - so install from sources or Ubuntu 12.4) edit ~/.subversion/config, in&nbsp;[tunnels] section add the line:<br> &nbsp;gforge = sshpass -f{path to file holding password} ssh -o PubkeyAuthentication=no -o ControlMaster=no <br> Then check out with: &nbsp;svn checkout svn+gforge://&lt;user&gt;@gforge.ucd.ie/var/lib/gforge/chroot/scmrepos/svn/fupermod/trunk fupermod (where previously it was:&nbsp;svn checkout svn+ssh) 8a9816f51b773760dcffb16ea02a2677f3e420b0 690 689 2012-03-14T15:50:50Z Davepc 2 wikitext text/x-wiki http://svnbook.red-bean.com/ *Subversion clients work with <code>.svn</code> directories - don't remove them. *Mind the version of the client (currently, 1.5, 1.6). == Repositories == *http://hcl.ucd.ie/repos/project_name - read only *https://hcl.ucd.ie/repos/project_name - authenticated user access == To submit == *Software sources: models, code, resource files *Documentation sources: texts, diagrams, data *Configuration files *Test sourses: code, input data == Not to submit == *Binaries: object files, libraries, executables *Built documentation: html, pdf *Personal settings: Eclipse projects, ... *Test output = Subversion for Users = A good linux client: [http://www.rapidsvn.org/index.php/Documentation RapidSVN]. == RapidSVN, Gforge &amp; passwords == '''Problem:''' RapidSVN doesn't directly support svn over ssh and so doesn't remember ssh passwords. And gforge.ucd.ie appares not to support passwordless authentication with publickey.&nbsp; '''Solution:''' Use sshpass to remember password. 
Note: this method involves having your gforge password in plain text, and so is a potential security risk - it should be different from other passwords, etc. Install sshpass &gt;=1.05 (note: the current Ubuntu uses version 1.04, which just hangs - so install from sources or Ubuntu 12.04) edit ~/.subversion/config and add the line in the [tunnels] section: gforge = sshpass -f{path to file holding password} ssh -o PubkeyAuthentication=no -o ControlMaster=no Then check out with: svn checkout svn+gforge://&lt;user&gt;@gforge.ucd.ie/var/lib/gforge/chroot/scmrepos/svn/fupermod/trunk fupermod (where previously it was: svn checkout svn+ssh) b72a8172d764b4640779eb23a8760fed85f2adf5 691 690 2012-03-14T16:00:36Z Davepc 2 /* Repositories */ wikitext text/x-wiki http://svnbook.red-bean.com/ *Subversion clients work with <code>.svn</code> directories - don't remove them. *Mind the version of the client (currently, 1.5, 1.6). == Repositories == *http://gforge.ucd.ie/softwaremap/tag_cloud.php?tag=heterogeneous+computing == To submit == *Software sources: models, code, resource files *Documentation sources: texts, diagrams, data *Configuration files *Test sources: code, input data == Not to submit == *Binaries: object files, libraries, executables *Built documentation: html, pdf *Personal settings: Eclipse projects, ... *Test output = Subversion for Users = A good Linux client: [http://www.rapidsvn.org/index.php/Documentation RapidSVN]. == RapidSVN, Gforge &amp; passwords == '''Problem:''' RapidSVN doesn't directly support svn over ssh and so doesn't remember ssh passwords. And gforge.ucd.ie appears not to support passwordless authentication with publickey. '''Solution:''' Use sshpass to remember the password. Note: this method involves having your gforge password in plain text, and so is a potential security risk - it should be different from other passwords, etc.
Install sshpass &gt;=1.05 (note: the current Ubuntu uses version 1.04, which just hangs - so install from sources or Ubuntu 12.04) edit ~/.subversion/config and add the line in the [tunnels] section: gforge = sshpass -f{path to file holding password} ssh -o PubkeyAuthentication=no -o ControlMaster=no Then check out with: svn checkout svn+gforge://&lt;user&gt;@gforge.ucd.ie/var/lib/gforge/chroot/scmrepos/svn/fupermod/trunk fupermod (where previously it was: svn checkout svn+ssh) 61e4a1c1c8a5b86563e5ba07d951c460b73be718 Boost 0 26 692 183 2012-04-11T11:41:59Z Quintin 10 wikitext text/x-wiki http://www.boost.org/ == Installation from sources == 1. By default, boost is configured with all libraries. To save time on building boost, you can configure it only with the libraries you need: <source lang="bash"> $ ./b2 --prefix=DIR --with-libraries=graph,serialization </source> To install boost with the MPI library, you need to add "using mpi ;" in the file "tools/build/v2/user-config.jam" 2. Default installation: - DIR/include/boost_version/boost - DIR/lib/libboost_library_versions.* Create symbolic links: <source lang="bash"> $ cd DIR/include; ln -s boost_version/boost $ cd DIR/lib; ln -s libboost_[library]_[version].[a/so] libboost_[library].[a/so] $ export LD_LIBRARY_PATH=DIR/lib:$LD_LIBRARY_PATH </source> == Documentation == * [http://www.boost.org/doc/libs/1_42_0/libs/graph/doc/table_of_contents.html Graph] * [http://www.boost.org/doc/libs/1_42_0/libs/serialization/doc/index.html Serialization] 72bf40821b94a488da1c829caf2367a9f4917444 693 692 2012-04-11T11:42:25Z Quintin 10 wikitext text/x-wiki http://www.boost.org/ == Installation from sources == 1. By default, boost is configured with all libraries. To save time on building boost, you can configure it only with the libraries you need: <source lang="bash"> $ ./b2 --prefix=DIR --with-libraries=graph,serialization </source> 2.
Default installation: - DIR/include/boost_version/boost - DIR/lib/libboost_library_versions.* Create symbolic links: <source lang="bash"> $ cd DIR/include; ln -s boost_version/boost $ cd DIR/lib; ln -s libboost_[library]_[version].[a/so] libboost_[library].[a/so] $ export LD_LIBRARY_PATH=DIR/lib:$LD_LIBRARY_PATH </source> To install boost with the MPI library, you need to add "using mpi ;" to the file "tools/build/v2/user-config.jam" == Documentation == * [http://www.boost.org/doc/libs/1_42_0/libs/graph/doc/table_of_contents.html Graph] * [http://www.boost.org/doc/libs/1_42_0/libs/serialization/doc/index.html Serialization] 6a1a2ca8d0ac76665e19dd93bef677135fa7e8a1 694 693 2012-04-11T14:13:37Z Quintin 10 wikitext text/x-wiki http://www.boost.org/ == Installation from sources == 1. By default, boost is configured with all libraries. To save time on building boost, you can configure it only with the libraries you need: <source lang="bash"> $ ./b2 --prefix=DIR --with-graph --with-serialization --with-mpi </source> 2.
Default installation: - DIR/include/boost_version/boost - DIR/lib/libboost_library_versions.* Create symbolic links: <source lang="bash"> $ cd DIR/include; ln -s boost_version/boost $ cd DIR/lib; ln -s libboost_[library]_[version].[a/so] libboost_[library].[a/so] $ export LD_LIBRARY_PATH=DIR/lib:$LD_LIBRARY_PATH </source> To install boost with the MPI library, you need to add "using mpi ;" to the file "tools/build/v2/user-config.jam" == Documentation == * [http://www.boost.org/doc/libs/1_42_0/libs/graph/doc/table_of_contents.html Graph] * [http://www.boost.org/doc/libs/1_42_0/libs/serialization/doc/index.html Serialization] 729096b493ee87cba1e71071b8dca76550db1b63 Gnuplot 0 22 695 678 2012-05-09T10:25:09Z Root 1 wikitext text/x-wiki [http://www.gnuplot.info/documentation.html Official gnuplot documentation] [http://gnuplot.sourceforge.net/demo/ Demo scripts for gnuplot] [http://t16web.lanl.gov/Kawano/gnuplot/index-e.html GNUPLOT: not so Frequently Asked Questions] When plotting "points" data files from fupermod, you will need [http://gnuplot.sourceforge.net/docs_4.2/node172.html this]: set datafile missing "." === Error message "';' expected" === That syntax (a linetype specification with just a number, but no keyword) has been deprecated for several years now. It had never been an officially documented feature anyway, and was removed ages ago. Have a look at "help plot style" to see how it's done. [http://groups.google.com/group/comp.graphics.apps.gnuplot/browse_thread/thread/00cb432c02560cf3 More] Deprecated: plot with lines 1 Should be: plot with lines ls 1 38a1b6ffa9536d40ddf53cb845585b1abe07e673 Pgfplot 0 88 696 2012-05-09T11:43:11Z Quintin 10 Created page with "PgfPlot is a package which let you do the same things as gnuplot but directly inside your latex file. And this has a lot of advantages: - no additional file - no conversion fro…" wikitext text/x-wiki PgfPlot is a package which lets you do the same things as gnuplot but directly inside your LaTeX file.
And this has a lot of advantages: - no additional file - no conversion from eps or ps or png to the pdf - the quality is really improved - the picture text has the same font as the rest of the document bdaa5dee2830ed5af0d0e9388510aaf788d36745 697 696 2012-05-09T11:43:47Z Quintin 10 wikitext text/x-wiki PgfPlot is a package which let you do the same things as gnuplot but directly inside your latex file. And this has a lot of advantages: - no additional file - no conversion from eps or ps or png to the pdf - the quality is really improved - the picture text has the same font as the rest of the document --code-- \usepackage{tikz} \usepackage{pgfplots} 01afb612b0a28a3f4fe18a4997eac7925dcc1baf 698 697 2012-05-09T11:48:15Z Quintin 10 wikitext text/x-wiki PgfPlot is a package which let you do the same things as gnuplot but directly inside your latex file. And this has a lot of advantages: - no additional file - no conversion from eps or ps or png to the pdf - the quality is really improved - the picture text has the same font as the rest of the document =Preambule= <source lang="latex"> \usepackage{tikz} \usepackage{pgfplots} </source> =example= <source lang="latex"> \begin{tikzpicture}[transform shape,scale=0.9] \begin{axis}[ ylabel style={yshift=-15pt},width=\textwidth,height=7.4cm, xlabel=Number of processors, ylabel=Execution time ($10^3$ s),legend style={at={(.9,.9)}},ymin=0,ymax=8.8,mark size=1.5] %, ytick={1000,2000,3000,4000,5000,6000,7000,8000}] %, %xtick={0,5,10,15,20,30,50},ytick={1000,2000,4000,8000}] \addplot+[] table[header=false,x index=0,y expr=\thisrowno{2}/1000]{data}; \addplot+[] table[header=false,x index=0,y expr=\thisrowno{2}/1000]{data}; \addplot+[] table[header=false,x index=0,yexpr=\thisrowno{2}/1000]{data}; \legend{WSCOM$_{\text{pf}}$,WSCOM,list\_min}; \end{axis}\end{tikzpicture} </source> <source lang="data"> 1 list_min 8241.04532500001 0.592721444672621 400 2 list_min 4276.264 0.541485563775196 400 3 list_min 2840.83145 0.240012784271948 400 4 
list_min 2171.53825 0.204766932333766 400 5 list_min 1771.9397 0.187049380812677 400 </source> 3df15de0232c0068442e9fc0ef442b75634ad488 699 698 2012-05-09T11:48:30Z Quintin 10 wikitext text/x-wiki PgfPlot is a package which let you do the same things as gnuplot but directly inside your latex file. And this has a lot of advantages: - no additional file - no conversion from eps or ps or png to the pdf - the quality is really improved - the picture text has the same font as the rest of the document =Preambule= <source lang="latex"> \usepackage{tikz} \usepackage{pgfplots} </source> =example= <source lang="latex"> \begin{tikzpicture}[transform shape,scale=0.9] \begin{axis}[ ylabel style={yshift=-15pt},width=\textwidth,height=7.4cm, xlabel=Number of processors, ylabel=Execution time ($10^3$ s),legend style={at={(.9,.9)}},ymin=0,ymax=8.8,mark size=1.5] %, ytick={1000,2000,3000,4000,5000,6000,7000,8000}] %, %xtick={0,5,10,15,20,30,50},ytick={1000,2000,4000,8000}] \addplot+[] table[header=false,x index=0,y expr=\thisrowno{2}/1000]{data}; \addplot+[] table[header=false,x index=0,y expr=\thisrowno{2}/1000]{data}; \addplot+[] table[header=false,x index=0,yexpr=\thisrowno{2}/1000]{data}; \legend{WSCOM$_{\text{pf}}$,WSCOM,list\_min}; \end{axis}\end{tikzpicture} </source> <source lang="plain"> 1 list_min 8241.04532500001 0.592721444672621 400 2 list_min 4276.264 0.541485563775196 400 3 list_min 2840.83145 0.240012784271948 400 4 list_min 2171.53825 0.204766932333766 400 5 list_min 1771.9397 0.187049380812677 400 </source> 0de39ea33f3b40456244551305af7b3df415dbdc 700 699 2012-05-09T11:48:46Z Quintin 10 wikitext text/x-wiki PgfPlot is a package which let you do the same things as gnuplot but directly inside your latex file. 
And this has a lot of advantages: - no additional file - no conversion from eps or ps or png to the pdf - the quality is really improved - the picture text has the same font as the rest of the document =Preambule= <source lang="latex"> \usepackage{tikz} \usepackage{pgfplots} </source> =example= <source lang="latex"> \begin{tikzpicture}[transform shape,scale=0.9] \begin{axis}[ ylabel style={yshift=-15pt},width=\textwidth,height=7.4cm, xlabel=Number of processors, ylabel=Execution time ($10^3$ s),legend style={at={(.9,.9)}},ymin=0,ymax=8.8,mark size=1.5] %, ytick={1000,2000,3000,4000,5000,6000,7000,8000}] %, %xtick={0,5,10,15,20,30,50},ytick={1000,2000,4000,8000}] \addplot+[] table[header=false,x index=0,y expr=\thisrowno{2}/1000]{data}; \addplot+[] table[header=false,x index=0,y expr=\thisrowno{2}/1000]{data}; \addplot+[] table[header=false,x index=0,yexpr=\thisrowno{2}/1000]{data}; \legend{WSCOM$_{\text{pf}}$,WSCOM,list\_min}; \end{axis}\end{tikzpicture} </source> <source lang="text"> 1 list_min 8241.04532500001 0.592721444672621 400 2 list_min 4276.264 0.541485563775196 400 3 list_min 2840.83145 0.240012784271948 400 4 list_min 2171.53825 0.204766932333766 400 5 list_min 1771.9397 0.187049380812677 400 </source> 05d4c249af4665e3e522e1a5d2181c252c0ac3f9 701 700 2012-05-09T11:49:38Z Quintin 10 wikitext text/x-wiki PgfPlot is a package which let you do the same things as gnuplot but directly inside your latex file. 
And this has a lot of advantages: - no additional file - no conversion from eps or ps or png to the pdf - the quality is really improved - the picture text has the same font as the rest of the document =Preambule= <source lang="latex"> \usepackage{tikz} \usepackage{pgfplots} </source> =example= <source lang="latex"> \begin{tikzpicture}[transform shape,scale=0.9] \begin{axis}[ ylabel style={yshift=-15pt},width=\textwidth,height=7.4cm, xlabel=Number of processors, ylabel=Execution time ($10^3$ s),legend style={at={(.9,.9)}},ymin=0,ymax=8.8,mark size=1.5] %, ytick={1000,2000,3000,4000,5000,6000,7000,8000}] %, %xtick={0,5,10,15,20,30,50},ytick={1000,2000,4000,8000}] \addplot+[] table[header=false,x index=0,y expr=\thisrowno{2}/1000]{data}; \addplot+[] table[header=false,x index=0,y expr=\thisrowno{2}/1000]{data}; \addplot+[] table[header=false,x index=0,yexpr=\thisrowno{2}/1000]{data}; \legend{WSCOM$_{\text{pf}}$,WSCOM,list\_min}; \end{axis}\end{tikzpicture} </source> a part of the file "data" is shown just after: <source lang="text"> 1 list_min 8241.04532500001 0.592721444672621 400 2 list_min 4276.264 0.541485563775196 400 3 list_min 2840.83145 0.240012784271948 400 4 list_min 2171.53825 0.204766932333766 400 5 list_min 1771.9397 0.187049380812677 400 </source> b3a384b9cb225d97e5168ebd57a266848533c6b4 702 701 2012-05-09T11:49:59Z Quintin 10 wikitext text/x-wiki PgfPlot is a package which let you do the same things as gnuplot but directly inside your latex file. 
And this has a lot of advantages: - no additional file - no conversion from eps or ps or png to the pdf - the quality is really improved - the picture text has the same font as the rest of the document =Preambule= <source lang="latex"> \usepackage{tikz} \usepackage{pgfplots} </source> =example= <source lang="latex"> \begin{tikzpicture}[transform shape,scale=0.9] \begin{axis}[ ylabel style={yshift=-15pt},width=\textwidth,height=7.4cm, xlabel=Number of processors, ylabel=Execution time ($10^3$ s),legend style={at={(.9,.9)}},ymin=0,ymax=8.8,mark size=1.5] \addplot+[] table[header=false,x index=0,y expr=\thisrowno{2}/1000]{data}; \addplot+[] table[header=false,x index=0,y expr=\thisrowno{2}/1000]{data}; \addplot+[] table[header=false,x index=0,yexpr=\thisrowno{2}/1000]{data}; \legend{WSCOM$_{\text{pf}}$,WSCOM,list\_min}; \end{axis}\end{tikzpicture} </source> a part of the file "data" is shown just after: <source lang="text"> 1 list_min 8241.04532500001 0.592721444672621 400 2 list_min 4276.264 0.541485563775196 400 3 list_min 2840.83145 0.240012784271948 400 4 list_min 2171.53825 0.204766932333766 400 5 list_min 1771.9397 0.187049380812677 400 </source> 7c74d315ce66c1c26d3da58fc76c3acd8491f9ed 703 702 2012-05-09T11:53:49Z Quintin 10 wikitext text/x-wiki PgfPlot is a package which let you do the same things as gnuplot but directly inside your latex file. 
This has several advantages: - no additional files - no conversion from eps, ps, or png to pdf - the quality is much better - the figure text uses the same font as the rest of the document =Preamble= <source lang="latex"> \usepackage{tikz} \usepackage{pgfplots} </source> =Example= <source lang="latex"> \begin{tikzpicture}[transform shape,scale=0.9] \begin{axis}[ ylabel style={yshift=-15pt}, % place of the ylabel width=\textwidth,height=7.4cm, %size of the picture xlabel=Number of processors, ylabel=Execution time ($10^3$ s), legend style={at={(.9,.9)}}, % place of the legend ymin=0, ymax=8.8, % range for the y axis mark size=1.5 % size of the points ] \addplot+[] table[header=false,x index=0,y expr=\thisrowno{2}/1000]{data}; \addplot+[] table[header=false,x index=0,y expr=\thisrowno{2}/1000]{data}; \addplot+[] table[header=false,x index=0,y expr=\thisrowno{2}/1000]{data}; \legend{WSCOM$_{\text{pf}}$,WSCOM,list\_min}; \end{axis}\end{tikzpicture} </source> Part of the file "data" is shown below: <source lang="text"> 1 list_min 8241.04532500001 0.592721444672621 400 2 list_min 4276.264 0.541485563775196 400 3 list_min 2840.83145 0.240012784271948 400 4 list_min 2171.53825 0.204766932333766 400 5 list_min 1771.9397 0.187049380812677 400 </source> b453ac929e20a70b35c16262a1429c22562d76a8 Main Page 0 1 704 682 2012-05-28T09:51:52Z Davepc 2 wikitext text/x-wiki This site is set up for sharing ideas, findings and experience in heterogeneous computing. Please, log in and create new or edit existing pages. To learn how to format wiki pages, read [[Help:Editing|here]].
== HCL software for heterogeneous computing == * Extensions for [[MPI]]: [http://hcl.ucd.ie/project/mpC mpC] [http://hcl.ucd.ie/project/HeteroMPI HeteroMPI] [http://hcl.ucd.ie/project/libELC libELC] * Extensions for [[GridRPC]]: [http://hcl.ucd.ie/project/SmartGridSolve SmartGridSolve] [http://hcl.ucd.ie/project/NI-Connect NI-Connect] * Computation benchmarking, modeling, dynamic load balancing: [http://hcl.ucd.ie/project/fupermod FuPerMod] [http://hcl.ucd.ie/project/pmm PMM] * Communication benchmarking, modeling, optimization: [http://hcl.ucd.ie/project/cpm CPM] [http://hcl.ucd.ie/project/mpiblib MPIBlib] == Heterogeneous mathematical software == * [http://hcl.ucd.ie/project/HeteroScaLAPACK HeteroScaLAPACK] * [http://hcl.ucd.ie/project/Hydropad Hydropad] == Operating systems == * [[Linux]] * [[Windows]] == Development tools == * [[C/C++]], [[Python]], [[UML]], [[FORTRAN]] * [[Autotools]] * [[GDB]], [[OProfile]], [[Valgrind]] * [[Doxygen]] * [[ChangeLog]], [[Subversion]] * [[Eclipse]] * [[Bash Scripts]] == [[Libraries]] == * [[GNU C Library]] * [[MPI]] * [[STL]], [[Boost]] * [[GSL]] * [[BLAS LAPACK ScaLAPACK]] * [[NLOPT]] * [[BitTorrent (B. 
Cohen's version)]] * [[CUDA SDK]] == Data processing == * [[gnuplot]], [[pgfplot]], [[matplotlib]] * [[Graphviz]] * [[Octave]], [[R]] * [[G3DViewer]] == Paper & Presentation Tools == * [[Dia]], [[PGF/Tikz]], [[pgfplot]] * [[LaTeX]], [[Beamer]] * [[BibTeX]], [[JabRef]] == Hardware == * [[HCL cluster]] * [[Other UCD Resources]] * [[UTK multicores + GPU]] * [[Grid5000]] [[SSH|How to connect to cluster via SSH]] [[hwloc|How to find information about the hardware]] == Mathematics == * [http://en.wikipedia.org/wiki/Confidence_interval Confidence interval (Statistics)], [http://en.wikipedia.org/wiki/Student's_t-distribution Student's t-distribution] (implemented in [[GSL]]) * [http://en.wikipedia.org/wiki/Linear_regression Linear regression] (implemented in [[GSL]]) * [http://en.wikipedia.org/wiki/Binomial_tree#Binomial_tree Binomial tree] (use [[Graphviz]] to visualize trees) * [http://en.wikipedia.org/wiki/Spline_interpolation Spline interpolation], [http://en.wikipedia.org/wiki/B-spline Spline approximation] (implemented in [[GSL]]) 85a5725d78fcf71b030c2409965422151d1b7bbd Bash Scripts 0 89 705 2012-05-28T09:54:13Z Davepc 2 Created page with "Open in vi all .c files containing a string vim -p `grep STRING *.[c]| cut -f1 -d ":"| uniq`" wikitext text/x-wiki Open in vi all .c files containing a string vim -p `grep STRING *.[c]| cut -f1 -d ":"| uniq` 14c4451867c14c0be9c3125d7709c7553ed821ff 711 705 2012-05-30T14:06:35Z Davepc 2 wikitext text/x-wiki A collection of useful bash scripts here. Open in vi all .c files containing a string vim -p `grep STRING *.[c]| cut -f1 -d ":"| uniq` 641891fa6d7c712948b198f5307b2a65c7a56739 Subversion 0 19 706 691 2012-05-30T13:40:05Z Davepc 2 /* RapidSVN, Gforge &amp; passwords */ wikitext text/x-wiki http://svnbook.red-bean.com/ *Subversion clients work with <code>.svn</code> directories - don't remove them. *Mind the version of the client (currently, 1.5, 1.6). 
== Repositories == *http://gforge.ucd.ie/softwaremap/tag_cloud.php?tag=heterogeneous+computing == To submit == *Software sources: models, code, resource files *Documentation sources: texts, diagrams, data *Configuration files *Test sourses: code, input data == Not to submit == *Binaries: object files, libraries, executables *Built documentation: html, pdf *Personal settings: Eclipse projects, ... *Test output = Subversion for Users = A good linux client: [http://www.rapidsvn.org/index.php/Documentation RapidSVN]. == RapidSVN, Gforge &amp; passwords == '''Problem:''' RapidSVN doesn't directly support svn over ssh and so doesn't remember ssh passwords. And gforge.ucd.ie appares not to support passwordless authentication with publickey.&nbsp; '''Solution:''' Use sshpass to remember password. Note: this method involves having your gforge password in plain text, and so is a potential security risk - it should be different to other passwords etc. Install sshpass &gt;=1.05 (note ubuntu 11.10 usese version 1.04 which just hangs - so install from sources or Ubuntu 12.4) edit ~/.subversion/config, in&nbsp;[tunnels] section add the line: gforge = sshpass -f{path to file holding password} ssh -o PubkeyAuthentication=no -o ControlMaster=no Then check out with: svn checkout svn+gforge://&lt;user&gt;@gforge.ucd.ie/var/lib/gforge/chroot/scmrepos/svn/fupermod/trunk fupermod (where previously it was:&nbsp;svn checkout svn+ssh) de8173083b09600f91207cc4ad32c35a44be0617 707 706 2012-05-30T13:58:49Z Davepc 2 /* Subversion for Users */ wikitext text/x-wiki http://svnbook.red-bean.com/ *Subversion clients work with <code>.svn</code> directories - don't remove them. *Mind the version of the client (currently, 1.5, 1.6). 
== Repositories == *http://gforge.ucd.ie/softwaremap/tag_cloud.php?tag=heterogeneous+computing == To submit == *Software sources: models, code, resource files *Documentation sources: texts, diagrams, data *Configuration files *Test sourses: code, input data == Not to submit == *Binaries: object files, libraries, executables *Built documentation: html, pdf *Personal settings: Eclipse projects, ... *Test output = Subversion for Users = A good linux client: [http://www.rapidsvn.org/index.php/Documentation RapidSVN], combined with [http://meldmerge.org/ Meld] a visual diff and merge tool. == RapidSVN, Gforge &amp; passwords == '''Problem:''' RapidSVN doesn't directly support svn over ssh and so doesn't remember ssh passwords. And gforge.ucd.ie appares not to support passwordless authentication with publickey.&nbsp; '''Solution:''' Use sshpass to remember password. Note: this method involves having your gforge password in plain text, and so is a potential security risk - it should be different to other passwords etc. Install sshpass &gt;=1.05 (note ubuntu 11.10 usese version 1.04 which just hangs - so install from sources or Ubuntu 12.4) edit ~/.subversion/config, in&nbsp;[tunnels] section add the line: gforge = sshpass -f{path to file holding password} ssh -o PubkeyAuthentication=no -o ControlMaster=no Then check out with: svn checkout svn+gforge://&lt;user&gt;@gforge.ucd.ie/var/lib/gforge/chroot/scmrepos/svn/fupermod/trunk fupermod (where previously it was:&nbsp;svn checkout svn+ssh) 6031d8e49bc771dcf836a0a323edb0cae6f7fa84 708 707 2012-05-30T13:59:41Z Davepc 2 /* Subversion for Users */ wikitext text/x-wiki http://svnbook.red-bean.com/ *Subversion clients work with <code>.svn</code> directories - don't remove them. *Mind the version of the client (currently, 1.5, 1.6). 
== Repositories == *http://gforge.ucd.ie/softwaremap/tag_cloud.php?tag=heterogeneous+computing == To submit == *Software sources: models, code, resource files *Documentation sources: texts, diagrams, data *Configuration files *Test sourses: code, input data == Not to submit == *Binaries: object files, libraries, executables *Built documentation: html, pdf *Personal settings: Eclipse projects, ... *Test output = Subversion for Users = A good cross platform client: [http://www.rapidsvn.org/index.php/Documentation RapidSVN], combined with [http://meldmerge.org/ Meld] a visual diff and merge tool. == RapidSVN, Gforge &amp; passwords == '''Problem:''' RapidSVN doesn't directly support svn over ssh and so doesn't remember ssh passwords. And gforge.ucd.ie appares not to support passwordless authentication with publickey.&nbsp; '''Solution:''' Use sshpass to remember password. Note: this method involves having your gforge password in plain text, and so is a potential security risk - it should be different to other passwords etc. Install sshpass &gt;=1.05 (note ubuntu 11.10 usese version 1.04 which just hangs - so install from sources or Ubuntu 12.4) edit ~/.subversion/config, in&nbsp;[tunnels] section add the line: gforge = sshpass -f{path to file holding password} ssh -o PubkeyAuthentication=no -o ControlMaster=no Then check out with: svn checkout svn+gforge://&lt;user&gt;@gforge.ucd.ie/var/lib/gforge/chroot/scmrepos/svn/fupermod/trunk fupermod (where previously it was:&nbsp;svn checkout svn+ssh) 4ae7c37db170adfdc298b9edf470a330f09df3e4 712 708 2012-06-15T09:55:40Z Davepc 2 /* RapidSVN, Gforge &amp; passwords */ wikitext text/x-wiki http://svnbook.red-bean.com/ *Subversion clients work with <code>.svn</code> directories - don't remove them. *Mind the version of the client (currently, 1.5, 1.6). 
== Repositories == *http://gforge.ucd.ie/softwaremap/tag_cloud.php?tag=heterogeneous+computing == To submit == *Software sources: models, code, resource files *Documentation sources: texts, diagrams, data *Configuration files *Test sources: code, input data == Not to submit == *Binaries: object files, libraries, executables *Built documentation: html, pdf *Personal settings: Eclipse projects, ... *Test output = Subversion for Users = A good cross-platform client: [http://www.rapidsvn.org/index.php/Documentation RapidSVN], combined with [http://meldmerge.org/ Meld], a visual diff and merge tool. == RapidSVN, Gforge &amp; passwords == '''Problem:''' RapidSVN doesn't directly support svn over ssh and so doesn't remember ssh passwords. And gforge.ucd.ie appears not to support passwordless authentication with publickey.&nbsp; '''Solution:''' Use sshpass to remember the password. Note: this method involves having your gforge password in plain text, and so is a potential security risk - it should be different from other passwords etc. Install sshpass &gt;=1.05 (note: Ubuntu 11.10 uses version 1.04, which just hangs - so install from sources or use Ubuntu 12.04). Edit ~/.subversion/config and, in the&nbsp;[tunnels] section, add the line: gforge = sshpass -f{path to file holding password} ssh -o PubkeyAuthentication=no -o ControlMaster=no Then check out with: svn checkout svn+gforge://&lt;user&gt;@gforge.ucd.ie/var/lib/gforge/chroot/scmrepos/svn/fupermod/trunk fupermod (where previously it was:&nbsp;svn checkout svn+ssh) To relocate an existing working copy: svn switch --relocate svn+ssh://&lt;user&gt;@gforge.ucd.ie/&lt;old url&gt; svn+gforge://&lt;user&gt;@gforge.ucd.ie/&lt;new url&gt; 9738d88d718b6c1bdec9cf3098e551a19a483561 Talk:Pgfplot 1 90 709 2012-05-30T14:02:19Z Davepc 2 Created page with "A link to further documentation would be nice here. Possibly a 3rd party site with good examples?" wikitext text/x-wiki A link to further documentation would be nice here.
Possibly a 3rd party site with good examples? 481f53fc038304f1a6a5dec82786b8a13e513d63 User:Davepc 2 91 710 2012-05-30T14:05:30Z Davepc 2 Created page with "[http://hcl.ucd.ie/user/david-clarke David Clarke] - HCL PhD student." wikitext text/x-wiki [http://hcl.ucd.ie/user/david-clarke David Clarke] - HCL PhD student. cf2af25a9f54ddfb6f55f43398af355f6dfdc480 HCL cluster 0 5 713 636 2012-07-11T23:13:03Z Davepc 2 /* Software packages available on HCL Cluster 2.0 */ wikitext text/x-wiki == General Information == [[Image:Cluster.jpg|right|thumbnail||HCL Cluster]] [[Image:network.jpg|right|thumbnail||Layout of the Cluster]] The HCL cluster is heterogeneous in both computing hardware and network capability. Nodes are from Dell, IBM, and HP, with Celeron, Pentium 4, Xeon, and AMD processors ranging in speed from 1.8 to 3.6 GHz. Accordingly, architectures and parameters such as front-side bus, cache, and main memory all vary. The operating system is Debian “squeeze” with Linux kernel 2.6.32. The network hardware consists of two Cisco 24+4 port Gigabit switches. Each node has two Gigabit Ethernet ports - each eth0 is connected to the first switch, and each eth1 is connected to the second switch. The switches are also connected to each other. The bandwidth of each port can be configured to any value between 8 Kb/s and 1 Gb/s. This allows testing on a very large number of network topologies. As the bandwidth on the link connecting the two switches can also be configured, the cluster can act as two separate clusters connected by a single link. The diagram shows a schematic of the cluster.
=== Detailed Cluster Specification === * [[HCL Cluster Specifications]] * [[Old HCL Cluster Specifications]] (pre May 2010) === Documentation === * [[media:PE750.tgz|Dell Poweredge 750 Documentation]] * [[media:SC1425.tgz|Dell Poweredge SC1425 Documentation]] * [[media:X306.pdf|IBM x-Series 306 Documentation]] * [[media:E326.pdf|IBM e-Series 326 Documentation ]] * [[media:Proliant100SeriesGuide.pdf|HP Proliant DL-140 G2 Documentation]] * [[media:ProliantDL320G3Guide.pdf|HP Proliant DL-320 G3 Documentation]] * [[media:Cisco3560Specs.pdf|Cisco Catalyst 3560 Specifications]] * [[media:Cisco3560Guide.pdf|Cisco Catalyst 3560 User Guide]] * [[HCL Cluster Network]] == Cluster Administration == If PBS jobs do not start after a reboot of heterogeneous.ucd.ie it may be necessary to manually start maui: /usr/local/maui/sbin/maui ===Useful Tools=== <code>root</code> on <code>heterogeneous.ucd.ie</code> has a number of [http://expect.nist.gov/ Expect] scripts to automate administration on the cluster (in <code>/root/scripts</code>). <code>root_ssh</code> will automatically log into a host, provide the root password and either return a shell to the user or execute a command that is passed as a second argument. Command syntax is as follows: <source lang="text"> # root_ssh usage: root_ssh [user@]<host> [command] </source> Example usage, to login and execute a command on each node in the cluster (note the file <code>/etc/dsh/machines.list</code> contains the hostnames of all compute nodes of the cluster): # for i in `seq -w 1 16`; do root_ssh hcl$i ps ax \| grep pbs; done The above is sequential. To run parallel jobs, for example: <code>apt-get update && apt-get -y upgrade</code>, try the following trick with [http://www.gnu.org/software/screen/ screen]: # for i in `seq -w 1 16`; do screen -L -d -m root_ssh hcl$i apt-get update \&\& apt-get -y upgrade; done You can check the screenlog.* files for errors and delete them when you are happy. 
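The per-node loop above can be sketched in a self-contained form. This is a sketch only: <code>run_on_node</code> is a hypothetical stand-in for <code>root_ssh</code>, with ssh stubbed out by <code>echo</code> so the pattern can be tried without cluster access.

```shell
#!/bin/sh
# Run a command on every node and keep a per-node log, mirroring the
# root_ssh loop above. On the real cluster, replace the body of
# run_on_node with: root_ssh "$host" "$@"
run_on_node() {
    host=$1; shift
    echo "[$host] $*"    # stand-in for the remote command's output
}

for i in $(seq -w 1 16); do
    run_on_node "hcl$i" ps ax > "screenlog.hcl$i" 2>&1
done
```

Redirecting each node's output to its own file avoids the problem of all output landing in a single screenlog.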
Sometimes all logs are sent to screenlog.0; it is not clear why. == Software packages available on HCL Cluster 2.0 == With a fresh installation of the operating system on the HCL Cluster, the following packages are available: * autoconf * automake * gcc * ctags * cg-vg * fftw2 * git * gfortran * gnuplot * libtool * netperf * octave3.2 * qhull * subversion * valgrind * gsl-dev * vim * python * mc * openmpi-bin * openmpi-dev * evince * libboost-graph-dev * libboost-serialization-dev * libatlas-base-dev * r-cran-strucchange * graphviz * doxygen * colorgcc [[HCL_cluster/hcl_node_install_configuration_log|new hcl node install & configuration log]] [[HCL_cluster/heterogeneous.ucd.ie_install_log|new heterogeneous.ucd.ie install log]] ===APT=== To do unattended updates on cluster machines, you need to specify some environment variables and switches to apt-get: export DEBIAN_FRONTEND=noninteractive apt-get -q -y upgrade NOTE: on hcl01 and hcl02 any update to grub will force a prompt, despite the switches above. This happens because there are two disks on these machines and grub asks which it should install itself on. == Access and Security == All access and security for the cluster is handled by the gateway machine (heterogeneous.ucd.ie). This machine is not considered a compute node and should not be used as such. The only new incoming connections allowed are ssh; other incoming packets, such as http, that are responding to requests from inside the cluster (established or related) are also allowed. Incoming ssh packets are only accepted if they originate from designated IP addresses. These IPs must be registered UCD IPs. csserver.ucd.ie is allowed, as is hclgate.ucd.ie, on which all users have accounts. Thus, to gain access to the cluster, you can ssh from csserver, hclgate, or other allowed machines to heterogeneous. From there you can ssh to any of the nodes (hcl01-hcl16) on which you are running a PBS job.
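The two-hop login described above can be automated in ~/.ssh/config. This is a sketch, assuming an OpenSSH client that supports the -W option; the Host aliases and the username placeholder are hypothetical.

```shell
# ~/.ssh/config fragment (hypothetical aliases; replace <user> with your login)
Host heterogeneous
    HostName heterogeneous.ucd.ie
    User <user>
    # Hop through csserver.ucd.ie, which accepts outside connections
    ProxyCommand ssh <user>@csserver.ucd.ie -W %h:%p

Host hcl01 hcl02 hcl03
    User <user>
    # Reach compute nodes through the gateway; the gateway resolves the node name
    ProxyCommand ssh heterogeneous -W %h:%p
```

With this in place, "ssh hcl01" from outside UCD chains through csserver and heterogeneous automatically.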
Access from outside the UCD network is only allowed once you have gained entry to a server that allows outside connections (such as csserver.ucd.ie). === Creating new user accounts === As root on heterogeneous run: adduser <username> make -C /var/yp === Access to the nodes is controlled by Torque PBS.=== Use qsub to submit a job; -I requests an interactive session; walltime is the time required. qsub -I -l walltime=1:00 \\ Reserve 1 node for 1 hour qsub -l nodes=hcl01+hcl07,walltime=1:00 myscript.sh Example Script: #!/bin/sh #General Script # # #These commands set up the Grid Environment for your job: #PBS -N JOBNAME #PBS -l walltime=48:00:00 #PBS -l nodes=16 #PBS -m abe #PBS -k eo #PBS -V echo foo To see the queue qstat -n showq To remove your job qdel JOBNUM More info: [http://www.clusterresources.com/products/torque/docs/] == Some networking issues on HCL cluster (unsolved) == "/sbin/route" should give: Kernel IP routing table Destination Gateway Genmask Flags Metric Ref Use Iface 239.2.11.72 * 255.255.255.255 UH 0 0 0 eth0 heterogeneous.u * 255.255.255.255 UH 0 0 0 eth0 192.168.21.0 * 255.255.255.0 U 0 0 0 eth1 192.168.20.0 * 255.255.255.0 U 0 0 0 eth0 192.168.20.0 * 255.255.254.0 U 0 0 0 eth0 192.168.20.0 * 255.255.254.0 U 0 0 0 eth1 default heterogeneous.u 0.0.0.0 UG 0 0 0 eth0 For unclear reasons, many machines sometimes lack the entry: 192.168.21.0 * 255.255.255.0 U 0 0 0 eth1 For Open MPI, this leads to an inability to perform a sockets "connect" call to any 192.*.21.* address (it hangs). In this case, you can * switch off eth1 (see also [http://hcl.ucd.ie/wiki/index.php/OpenMPI] ): mpirun --mca btl_tcp_if_exclude lo,eth1 ... or * restore the above table on all nodes by running "sh /etc/network/if-up.d/00routes" as root It is not yet clear why, without this entry, connections to the "21" addresses cannot be established.
We expect that in this case the following rule should be matched (because of the mask): 192.168.20.0 * 255.255.254.0 U 0 0 0 eth0 The packets then leave over the eth0 network interface and should travel over switch1 to switch2 and on to the eth1 interface of the corresponding node * If one attempts a ping from one node A, via its eth0 interface, to the address of another node's (B) eth1 interface, the following is observed: ** outgoing ping packets appear only on the eth0 interface of the first node A. ** incoming ping packets appear only on the eth1 interface of the second node B. ** outgoing ping response packets appear on the eth0 interface of the second node B, never on the eth1 interface, despite pinging the eth1 address specifically. What explains this? With the routing tables as they are above, or in the damaged case, the ping may arrive at the correct interface, but the response from B is routed to A-eth0 via B-eth0. Further, after a number of ping packets have been sent in sequence (50 to 100), pings from A, though the -i eth0 switch is specified, begin to appear on both A-eth0 and A-eth1. This behaviour is unexpected, but does not affect the return path of the ping response packet. In order to get symmetric behaviour, where a packet leaves A-eth0, travels via the switch bridge to B-eth1 and returns from B-eth1 to A-eth0, one must ensure the routing table of B contains no eth0 entries. == Paging and the OOM-Killer == Due to the nature of experiments run on the cluster, we often induce heavy paging and complete exhaustion of available memory on certain nodes. Linux has a pair of strategies to deal with heavy memory use. The first is overcommitting: a process is allowed to allocate or fork even when there is no more memory available. You can see some interesting numbers here:[http://opsmonkey.blogspot.com/2007/01/linux-memory-overcommit.html].
The assumption is that processes may not use all the memory they allocate, and that failing on allocation is worse than failing later, when the memory is actually used. More processes can be supported by allowing them to allocate memory (provided they do not use it all). The second part of the strategy is the Out-of-Memory killer (OOM killer). When memory has been exhausted and a process tries to use some 'overcommitted' part of memory, the OOM killer is invoked. Its job is to rank all processes in terms of their memory use, priority, privilege and some other parameters, and then select a process to kill based on the ranks. The argument for using overcommit + OOM killer is that rather than failing to allocate memory for some random unlucky process, which as a result would probably terminate, the kernel can instead allow the unlucky process to continue executing and then make a somewhat informed decision on which process to kill. Unfortunately, the behaviour of the OOM killer sometimes causes problems which grind the machine to a complete halt, particularly when it decides to kill system processes. There is a good discussion of the OOM killer here: [http://lwn.net/Articles/104179/] For this reason overcommit has been disabled on the cluster.
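With overcommit disabled (strict accounting), allocations fail deterministically once the commit limit is reached. A sketch of the limit the kernel enforces under vm.overcommit_memory = 2, per the kernel's overcommit-accounting documentation (the RAM and swap figures below are illustrative, not the cluster's actual configuration; hugetlb pages are ignored):

```shell
# CommitLimit under vm.overcommit_memory = 2:
#   CommitLimit = swap + ram * overcommit_ratio / 100
ram_kb=$((8 * 1024 * 1024))    # 8 GiB of RAM (illustrative)
swap_kb=$((2 * 1024 * 1024))   # 2 GiB of swap (illustrative)

ratio=100                      # as set on the cluster: all of RAM + swap
limit_kb=$(( swap_kb + ram_kb * ratio / 100 ))
echo "CommitLimit with ratio=100: $(( limit_kb / 1024 / 1024 )) GiB"   # 10 GiB

ratio=50                       # kernel default: half of RAM + swap
limit_kb=$(( swap_kb + ram_kb * ratio / 100 ))
echo "CommitLimit with ratio=50:  $(( limit_kb / 1024 / 1024 )) GiB"   # 6 GiB
```

So with overcommit_ratio set to 100 the cluster still allows the whole of RAM plus swap to be committed, but no more; the current value can be read from /proc/meminfo as CommitLimit.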
The current settings:
 cat /proc/sys/vm/overcommit_memory
 2
 cat /proc/sys/vm/overcommit_ratio
 100
To restore the default overcommit settings:
 # echo 0 > /proc/sys/vm/overcommit_memory
 # echo 50 > /proc/sys/vm/overcommit_ratio

== Manually Limit the Memory on the OS level ==
As root, edit /etc/default/grub:
 GRUB_CMDLINE_LINUX_DEFAULT="quiet mem=128M"
then run the command:
 update-grub
 reboot
ddc555ca7507c76f77cbba5f2d954c8670d5cd0a UTK multicores + GPU 0 76 714 606 2012-07-12T10:04:59Z Zhongziming 5 wikitext text/x-wiki == List of machines == http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180 UTK machines == Checking number and type of GPUs == nvidia-smi -L b94795e5ce92991e274b8f42201baf93c53ef6cb 715 714 2012-07-12T10:05:31Z Zhongziming 5 wikitext text/x-wiki == List of machines == http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180 == Checking number and type of GPUs == nvidia-smi -L 2db1fef375230b2fc7f9cd54acd3364c87ca32fd 716 715 2012-07-12T10:06:19Z Zhongziming 5 wikitext text/x-wiki == List of machines == http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180 == Getting the info of GPUs on a node == nvidia-smi -L 13d6d8fb22125ec7e373dde1752b31e642feb9d3 717 716 2012-07-12T10:11:59Z Zhongziming 5 wikitext text/x-wiki == List of machines == http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180 == Display a list of available GPUs == $ nvidia-smi -L 08df0ba5139a456ca342d6172d4816004baf3bd7 718 717 2012-07-12T10:20:57Z Zhongziming 5 wikitext text/x-wiki == List of machines == http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180 == Display a list of available GPUs == $ nvidia-smi -L == Using Fupermod on hybrid node == *Compiling c738c924255cee5b8abebd51307a923d052758ba 719 718 2012-07-12T10:22:34Z Zhongziming 5 /* Using Fupermod on hybrid node */ wikitext text/x-wiki == List of machines == http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180 == Display a list of available GPUs == $ nvidia-smi -L == Using Fupermod on hybrid node ==
*Compiling Currently the user needs to compile the code for CPU and GPU separately 096f11e88ab4652f28616d349e7c586f004df514 720 719 2012-07-12T10:31:41Z Zhongziming 5 wikitext text/x-wiki == List of machines == http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180 == Display a list of available GPUs == $ nvidia-smi -L == Using Fupermod on hybrid multicore/GPUs node == *Compiling : Create two directories for configuring with cblas (for CPU) and cublas (for GPU). For example: /* Using acml for CPU computing */ mkdir acml_config cd acml_config ./configure --with-cblas=acml /* Using cublas for GPU computing */ mkdir cuda_config cd cuda_config ./configure --with-cblas=cuda a74b83b9e786bfd4986a227260cf472e367d6acc 721 720 2012-07-12T10:33:46Z Zhongziming 5 wikitext text/x-wiki == List of machines == http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180 == Display a list of available GPUs == $ nvidia-smi -L == Using Fupermod on hybrid multicore/GPUs node == *Compiling : Create two directories for configuration with CPU cblas (e.g. gsl, acml, mkl) and GPU cblas (e.g. cublas). For example: cd fupermod/ /* Using acml for CPU computing */ mkdir acml_config cd acml_config ./configure --with-cblas=acml /* Using cublas for GPU computing */ mkdir cuda_config cd cuda_config ./configure --with-cblas=cuda 65a09899b18e4fbfbbe7a5d0d0c8000ba7dd700e 722 721 2012-07-12T10:34:21Z Zhongziming 5 wikitext text/x-wiki == List of machines == http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180 == Display a list of available GPUs == $ nvidia-smi -L == Using Fupermod on hybrid multicore/GPUs node == *Compiling : Create two directories for configuration with CPU cblas (e.g. gsl, acml, mkl) and GPU cblas (e.g. cublas).
For example: cd fupermod/ mkdir acml_config /* Using acml for CPU compuing*/ cd acml_config ./configure --with-cblas=acml mkdir cuda_config /* Using cublas for GPU computing */ cd cuda_config ./configure --with-cblas=cuda 2ff9cf8c8d5bf41bb8460012ee162c553a53c2c2 723 722 2012-07-12T10:35:31Z Zhongziming 5 wikitext text/x-wiki == List of machines == http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180 == Display a list of available GPUs == $ nvidia-smi -L == Using Fupermod on hybrid multicore/GPUs node == *Compiling : Create two separate directories for configuration with selected CPU cblas (e.g. gsl, acml, mkl)and GPU cblas (e.g. cublas). For example: cd fupermod/ /* Using acml for CPU compuing*/ mkdir acml_config cd acml_config ./configure --with-cblas=acml /* Using cublas for GPU computing */ mkdir cuda_config cd cuda_config ./configure --with-cblas=cuda 07ca405717ad618e453f697f806bc0327db7ec4e 724 723 2012-07-12T10:35:51Z Zhongziming 5 wikitext text/x-wiki == List of machines == http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180 == Display a list of available GPUs == $ nvidia-smi -L == Using Fupermod on hybrid multicore/GPUs node == *Compiling : Create two separate directories for configuration with selected CPU cblas (e.g. gsl, acml, mkl)and GPU cblas (e.g. cublas). For example: cd fupermod/ /* Using acml for CPU compuing*/ mkdir acml_config cd acml_config ./configure --with-cblas=acml /* Using cublas for GPU computing */ mkdir cuda_config cd cuda_config ./configure --with-cblas=cuda fecc68b0ff37e67c75a395c5224ba2ee53f21d7b 725 724 2012-07-12T10:36:37Z Zhongziming 5 wikitext text/x-wiki == List of machines == http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180 == Display a list of available GPUs == $ nvidia-smi -L == Using Fupermod on hybrid multicore/GPUs node == *Compiling : Create two separate directories for configuration with selected CPU cblas (e.g. gsl, acml, mkl)and GPU cblas (e.g. cublas). 
For example: cd fupermod/ /* Using acml for CPU compuing*/ mkdir acml_config cd acml_config ./configure --with-cblas=acml /* Using cublas for GPU computing */ mkdir cuda_config cd cuda_config ./configure --with-cblas=cuda 1293b04f92fcd9645d38ccf231fb5e2cc4cead75 726 725 2012-07-12T10:37:11Z Zhongziming 5 wikitext text/x-wiki == List of machines == http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180 == Display a list of available GPUs == $ nvidia-smi -L == Using Fupermod on hybrid multicore/GPUs node == *Compiling : Create two separate directories for configuration with selected CPU cblas (e.g. gsl, acml, mkl)and GPU cblas (e.g. cublas). For example: cd fupermod/ /* Using acml for CPU compuing*/ mkdir acml_config cd acml_config ./configure --with-cblas=acml /* Using cublas for GPU computing */ mkdir cuda_config cd cuda_config ./configure --with-cblas=cuda fb83984dbcffaf9195a00cce4c4e919c3309c792 727 726 2012-07-12T10:38:06Z Zhongziming 5 wikitext text/x-wiki == List of machines == http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180 == Display a list of available GPUs == $ nvidia-smi -L == Using Fupermod on hybrid multicore/GPUs node == *Compiling : Create two separate directories for configuration with selected CPU cblas (e.g. gsl, acml, mkl)and GPU cblas (e.g. cublas). For example: cd fupermod/ /* Using acml for CPU compuing*/ mkdir acml_config cd acml_config ./configure --with-cblas=acml /* Using cublas for GPU computing */ mkdir cuda_config cd cuda_config ./configure --with-cblas=cuda c7b101fac0f3bd69b37069f2a10b444fd45203f0 728 727 2012-07-12T10:38:26Z Zhongziming 5 wikitext text/x-wiki == List of machines == http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180 == Display a list of available GPUs == $ nvidia-smi -L == Using Fupermod on hybrid multicore/GPUs node == *Compiling : Create two separate directories for configuration with selected CPU cblas (e.g. gsl, acml, mkl)and GPU cblas (e.g. cublas). 
For example: cd fupermod/ /* Using acml for CPU compuing*/ mkdir acml_config cd acml_config ./configure --with-cblas=acml /* Using cublas for GPU computing */ mkdir cuda_config cd cuda_config ./configure --with-cblas=cuda 90fb7af8fd7f9f37341e77af3c5e1b8c4a34db1c 729 728 2012-07-12T10:38:57Z Zhongziming 5 wikitext text/x-wiki == List of machines == http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180 == Display a list of available GPUs == $ nvidia-smi -L == Using Fupermod on hybrid multicore/GPUs node == *Compiling : Create two separate directories for configuration with selected CPU cblas (e.g. gsl, acml, mkl)and GPU cblas (e.g. cublas). For example: cd fupermod/ ** Using acml for CPU compuing mkdir acml_config cd acml_config ./configure --with-cblas=acml ** Using cublas for GPU computing */ mkdir cuda_config cd cuda_config ./configure --with-cblas=cuda dd2f6a57624e78c03a7db0ec3626fca0cb5b095d 730 729 2012-07-12T10:40:04Z Zhongziming 5 wikitext text/x-wiki == List of machines == http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180 == Display a list of available GPUs == $ nvidia-smi -L == Using Fupermod on hybrid multicore/GPUs node == *Compiling : Create two separate directories for configuration with selected CPU cblas (e.g. gsl, acml, mkl)and GPU cblas (e.g. cublas). **For example: Using acml blas for CPU and cublas for GPU computing cd fupermod/ mkdir acml_config cd acml_config ./configure --with-cblas=acml mkdir cuda_config cd cuda_config ./configure --with-cblas=cuda 041886e30e4044f3ca3219d9a59b56df41381b78 731 730 2012-07-12T10:42:16Z Zhongziming 5 wikitext text/x-wiki == List of machines == http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180 == Display a list of available GPUs == $ nvidia-smi -L == Using Fupermod on hybrid multicore/GPUs node == * Compiling : Create two separate directories for configuration with selected CPU cblas (e.g. gsl, acml, mkl)and GPU cblas (e.g. cublas). 
** For example: Using acml blas for CPU and cublas for GPU computing cd fupermod/ mkdir acml_config cd acml_config ./configure --with-cblas=acml mkdir cuda_config cd cuda_config ./configure --with-cblas=cuda * Build performance model: /* * rankfile is for process binding * appfile tells what processes will execute */ $ mpirun -rf rankfile -app appfile_fpm b25cece44b1d509939db282b3f13e21fb4b45ca1 732 731 2012-07-12T10:44:06Z Zhongziming 5 wikitext text/x-wiki == List of machines == http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180 == Display a list of available GPUs == $ nvidia-smi -L == Using Fupermod on hybrid multicore/GPUs node == * Compiling : Create two separate directories for configuration with selected CPU cblas (e.g. gsl, acml, mkl)and GPU cblas (e.g. cublas). ** For example: Using acml blas for CPU and cublas for GPU computing cd fupermod/ mkdir acml_config cd acml_config ./configure --with-cblas=acml mkdir cuda_config cd cuda_config ./configure --with-cblas=cuda * Building performance model: ** rankfile is for processing binding, appfile tell mpi what programs to launch $ mpirun -rf rankfile -app appfile_fpm 49ddb45adb359a18cf781350ad7099f02f160e3c 733 732 2012-07-12T10:45:38Z Zhongziming 5 wikitext text/x-wiki == List of machines == http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180 == Display a list of available GPUs == $ nvidia-smi -L == Using Fupermod on hybrid multicore/GPUs node == * Compiling : Create two separate directories for configuration with selected CPU cblas (e.g. gsl, acml, mkl)and GPU cblas (e.g. cublas). 
** For example: Using acml blas for CPU and cublas for GPU computing cd fupermod/ mkdir acml_config cd acml_config ./configure --with-cblas=acml mkdir cuda_config cd cuda_config ./configure --with-cblas=cuda * Building performance model: ** rankfile is for processing binding, and appfile tells mpi what programs to launch $ mpirun -rf rankfile -app appfile_fpm example of a rankfile: rank 0=ig.icl.utk.edu slot=0:0 rank 1=ig.icl.utk.edu slot=0:1 . . . 8fef4e16bd528853e4173a7312d0d5db1390873a 734 733 2012-07-12T10:47:34Z Zhongziming 5 wikitext text/x-wiki == List of machines == http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180 == Display a list of available GPUs == $ nvidia-smi -L == Using Fupermod on hybrid multicore/GPUs node == * Compiling : Create two separate directories for configuration with selected CPU cblas (e.g. gsl, acml, mkl)and GPU cblas (e.g. cublas). ** For example: Using acml blas for CPU and cublas for GPU computing cd fupermod/ mkdir acml_config cd acml_config ./configure --with-cblas=acml mkdir cuda_config cd cuda_config ./configure --with-cblas=cuda * Building performance model: ** rankfile is for processing binding, and appfile tells mpi what programs to launch $ mpirun -rf rankfile -app appfile_fpm ** example of a rankfile: rank 0=ig.icl.utk.edu slot=0:0 rank 1=ig.icl.utk.edu slot=0:1 ... ** example of a appfile for building functional permanence model (FPM): # GPU # e.g. Linking against cublas, and fupermod is configured under cublas_config # suboption g=0 means device 0 is selected for computing -host localhost -np 1 $HOME/fupermod/cublas_config/tools/builder -l $HOME/fupermod/cublas_config/routines/mxm/.libs/libmxm_col.so -o k=640,g=0 -U10000 -s10 #--------------------------------------------------------------------------------------------------------------------------------------------------------------- # CPU # e.g. 
Linking against acml, and fupermod is configured under acml_config -host localhost -np 47 $HOME/fupermod/acml_config/tools/builder -l $HOME/fupermod/acml_config/routines/mxm/.libs/libmxm_col.so -o k=640 -U10000 -s10 6bc087b92f57e587be03915d5755604f5c03a303 735 734 2012-07-12T10:48:30Z Zhongziming 5 wikitext text/x-wiki == List of machines == http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180 == Display a list of available GPUs == $ nvidia-smi -L == Using Fupermod on hybrid multicore/GPUs node == * Compiling : Create two separate directories for configuration with selected CPU cblas (e.g. gsl, acml, mkl)and GPU cblas (e.g. cublas). ** For example: Using acml blas for CPU and cublas for GPU computing cd fupermod/ mkdir acml_config cd acml_config ./configure --with-cblas=acml mkdir cuda_config cd cuda_config ./configure --with-cblas=cuda * Building performance model: ** rankfile is for processing binding, and appfile tells mpi what programs to launch $ mpirun -rf rankfile -app appfile_fpm ** example of a rankfile: rank 0=ig.icl.utk.edu slot=0:0 rank 1=ig.icl.utk.edu slot=0:1 ... ** example of a appfile for building functional permanence model (FPM): # GPU # e.g. Linking against cublas, and fupermod is configured under cublas_config # suboption g=0 means device 0 is selected for computing -host localhost -np 1 $HOME/fupermod/cublas_config/tools/builder -l $HOME/fupermod/cublas_config/routines/mxm/.libs/libmxm_col.so -o k=640,g=0 -U10000 -s10 #--------------------------------------------------------------------------------------------------------------------------------------------------------------- # CPU # e.g. 
Linking against acml, and fupermod is configured under acml_config -host localhost -np 47 $HOME/fupermod/acml_config/tools/builder -l $HOME/fupermod/acml_config/routines/mxm/.libs/libmxm_col.so -o k=640 -U10000 -s10 6956087e9ddf3c964fa980485ebd3253a5ef6185 736 735 2012-07-12T10:48:45Z Zhongziming 5 wikitext text/x-wiki == List of machines == http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180 == Display a list of available GPUs == $ nvidia-smi -L == Using Fupermod on hybrid multicore/GPUs node == * Compiling : Create two separate directories for configuration with selected CPU cblas (e.g. gsl, acml, mkl)and GPU cblas (e.g. cublas). ** For example: Using acml blas for CPU and cublas for GPU computing cd fupermod/ mkdir acml_config cd acml_config ./configure --with-cblas=acml mkdir cuda_config cd cuda_config ./configure --with-cblas=cuda * Building performance model: ** rankfile is for processing binding, and appfile tells mpi what programs to launch $ mpirun -rf rankfile -app appfile_fpm ** example of a rankfile: rank 0=ig.icl.utk.edu slot=0:0 rank 1=ig.icl.utk.edu slot=0:1 ... ** example of a appfile for building functional permanence model (FPM): # GPU # e.g. Linking against cublas, and fupermod is configured under cublas_config # suboption g=0 means device 0 is selected for computing -host localhost -np 1 $HOME/fupermod/cublas_config/tools/builder -l $HOME/fupermod/cublas_config/routines/mxm/.libs/libmxm_col.so -o k=640,g=0 -U10000 -s10 # CPU # e.g. 
Linking against acml, and fupermod is configured under acml_config -host localhost -np 47 $HOME/fupermod/acml_config/tools/builder -l $HOME/fupermod/acml_config/routines/mxm/.libs/libmxm_col.so -o k=640 -U10000 -s10 050e0b8843cd908b532756138910d83dc7606cf3 737 736 2012-07-12T10:49:29Z Zhongziming 5 wikitext text/x-wiki == List of machines == http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180 == Display a list of available GPUs == $ nvidia-smi -L == Using Fupermod on hybrid multicore/GPUs node == * Compiling : Create two separate directories for configuration with selected CPU cblas (e.g. gsl, acml, mkl)and GPU cblas (e.g. cublas). ** For example: Using acml blas for CPU and cublas for GPU computing cd fupermod/ mkdir acml_config cd acml_config ./configure --with-cblas=acml mkdir cuda_config cd cuda_config ./configure --with-cblas=cuda * Building performance model: ** rankfile is for processing binding, and appfile tells mpi what programs to launch $ mpirun -rf rankfile -app appfile_fpm ** example of a rankfile: rank 0=ig.icl.utk.edu slot=0:0 rank 1=ig.icl.utk.edu slot=0:1 ... ** example of a appfile for building functional permanence model (FPM): # GPU # e.g. Linking against cublas, and fupermod is configured under cublas_config # suboption g=0 means device 0 is selected for computing -host localhost -np 1 $HOME/fupermod/cublas_config/tools/builder -l $HOME/fupermod/cublas_config/routines/mxm/.libs/libmxm_col.so -o k=640,g=0 -U10000 -s10 # ----------------------------------------------------------------------------- # CPU # e.g. 
Linking against acml, and fupermod is configured under acml_config -host localhost -np 47 $HOME/fupermod/acml_config/tools/builder -l $HOME/fupermod/acml_config/routines/mxm/.libs/libmxm_col.so -o k=640 -U10000 -s10 97f475fbca52ab4bf54d98dd53b1bca600da2e7a 738 737 2012-07-12T10:54:07Z Zhongziming 5 wikitext text/x-wiki == List of machines == http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180 == Display a list of available GPUs == $ nvidia-smi -L == Using Fupermod on hybrid multicore/GPUs node == * Compiling : Create two separate directories for configuration with selected CPU cblas (e.g. gsl, acml, mkl)and GPU cblas (e.g. cublas). ** For example: Using acml blas for CPU and cublas for GPU computing cd fupermod/ mkdir acml_config cd acml_config ./configure --with-cblas=acml mkdir cuda_config cd cuda_config ./configure --with-cblas=cuda * Building performance model: ** rankfile is for processing binding, and appfile tells mpi what programs to launch $ mpirun -rf rankfile -app appfile_fpm ** example of a rankfile: rank 0=ig.icl.utk.edu slot=0:0 rank 1=ig.icl.utk.edu slot=0:1 ... ** example of an appfile for building functional permanence model (FPM): # GPU # e.g. Linking against cublas, and fupermod is configured under cublas_config # suboption g=0 means device 0 is selected for computing -host localhost -np 1 $HOME/fupermod/cublas_config/tools/builder -l $HOME/fupermod/cublas_config/routines/mxm/.libs/libmxm_col.so -o k=640,g=0 -U10000 -s10 # ----------------------------------------------------------------------------- # CPU # e.g. 
Linking against acml, and fupermod is configured under acml_config -host localhost -np 47 $HOME/fupermod/acml_config/tools/builder -l $HOME/fupermod/acml_config/routines/mxm/.libs/libmxm_col.so -o k=640 -U10000 -s10 * Data partitioning ** matrix size D = N x N, N = sqrt(D), and machinefile lists the nodes participating in the computing $ fupermod/tools/partitioner -l fupermod/routines/mxm/.libs/libmxm_col.so -D10000 -o N=100 -m machinefile * Running matrix multiplication $ mpirun -rf rankfile -app appfile_mxm ** example of an appfile for matrix multiplication # GPU # Assuming fupermod is configured under cublas_config, linking against cublas # -g0 means device 0 is selected for computing -host localhost -np 1 $HOME/fupermod/cublas_config/routines/mxm/mxm_col -k640 -g0 -m machinefile #-------------------------------------------------------------------------------------------------------- # CPU # Assuming fupermod is configured under acml_config, linking against acml -host localhost -np 47 $HOME/fupermod/acml_config/routines/mxm/mxm_col -k640 -m machinefile 6f6e4a255295c19a05eb1f967f426a43208caa11 739 738 2012-07-12T10:54:56Z Zhongziming 5 wikitext text/x-wiki == List of machines == http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180 == Display a list of available GPUs == $ nvidia-smi -L == Using Fupermod on hybrid multicore/GPUs node == * Compiling : Create two separate directories for configuration with selected CPU cblas (e.g. gsl, acml, mkl)and GPU cblas (e.g. cublas). ** For example: Using acml blas for CPU and cublas for GPU computing cd fupermod/ mkdir acml_config cd acml_config ./configure --with-cblas=acml mkdir cuda_config cd cuda_config ./configure --with-cblas=cuda * Building performance model: ** rankfile is for processing binding, and appfile tells mpi what programs to launch $ mpirun -rf rankfile -app appfile_fpm ** example of a rankfile: rank 0=ig.icl.utk.edu slot=0:0 rank 1=ig.icl.utk.edu slot=0:1 ... 
** example of an appfile for building functional permanence model (FPM): # GPU # # e.g. Linking against cublas, and fupermod is configured under cublas_config # suboption g=0 means device 0 is selected for computing -host localhost -np 1 $HOME/fupermod/cublas_config/tools/builder -l $HOME/fupermod/cublas_config/routines/mxm/.libs/libmxm_col.so -o k=640,g=0 -U10000 -s10 # CPU # # e.g. Linking against acml, and fupermod is configured under acml_config -host localhost -np 47 $HOME/fupermod/acml_config/tools/builder -l $HOME/fupermod/acml_config/routines/mxm/.libs/libmxm_col.so -o k=640 -U10000 -s10 * Data partitioning ** matrix size D = N x N, N = sqrt(D), and machinefile lists the nodes participating in the computing $ fupermod/tools/partitioner -l fupermod/routines/mxm/.libs/libmxm_col.so -D10000 -o N=100 -m machinefile * Running matrix multiplication $ mpirun -rf rankfile -app appfile_mxm ** example of an appfile for matrix multiplication # GPU # # Assuming fupermod is configured under cublas_config, linking against cublas # -g0 means device 0 is selected for computing -host localhost -np 1 $HOME/fupermod/cublas_config/routines/mxm/mxm_col -k640 -g0 -m machinefile # CPU # # Assuming fupermod is configured under acml_config, linking against acml -host localhost -np 47 $HOME/fupermod/acml_config/routines/mxm/mxm_col -k640 -m machinefile 27371c15e9807aaf609bbf0061ee055081059020 740 739 2012-07-12T10:56:08Z Zhongziming 5 wikitext text/x-wiki == List of machines == http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180 == Display a list of available GPUs == $ nvidia-smi -L == Using Fupermod on hybrid multicore/GPUs node == * Compiling : Create two separate directories for configuration with selected CPU cblas (e.g. gsl, acml, mkl)and GPU cblas (e.g. cublas). 
** For example: Using acml blas for CPU and cublas for GPU computing cd fupermod/ mkdir acml_config cd acml_config ./configure --with-cblas=acml mkdir cuda_config cd cuda_config ./configure --with-cblas=cuda * Building performance model: ** rankfile is for processing binding, and appfile tells mpi what programs to launch $ mpirun -rf rankfile -app appfile_fpm ** example of a rankfile: rank 0=ig.icl.utk.edu slot=0:0 rank 1=ig.icl.utk.edu slot=0:1 ... ** example of an appfile for building functional permanence model (FPM): # GPU # e.g. Linking against cublas, and fupermod is configured under cublas_config # suboption g=0 means device 0 is selected for computing -host localhost -np 1 $HOME/fupermod/cublas_config/tools/builder -l $HOME/fupermod/cublas_config/routines/mxm/.libs/libmxm_col.so -o k=640,g=0 -U10000 -s10 # CPU # e.g. Linking against acml, and fupermod is configured under acml_config -host localhost -np 47 $HOME/fupermod/acml_config/tools/builder -l $HOME/fupermod/acml_config/routines/mxm/.libs/libmxm_col.so -o k=640 -U10000 -s10 * Data partitioning ** matrix size D = N x N, and machinefile lists the nodes participating in the computing $ fupermod/tools/partitioner -l fupermod/routines/mxm/.libs/libmxm_col.so -D10000 -o N=100 -m machinefile * Running matrix multiplication $ mpirun -rf rankfile -app appfile_mxm ** example of an appfile for matrix multiplication # GPU # Assuming fupermod is configured under cublas_config, linking against cublas # -g0 means device 0 is selected for computing -host localhost -np 1 $HOME/fupermod/cublas_config/routines/mxm/mxm_col -k640 -g0 -m machinefile # CPU # Assuming fupermod is configured under acml_config, linking against acml -host localhost -np 47 $HOME/fupermod/acml_config/routines/mxm/mxm_col -k640 -m machinefile d1bc2b58e4818448b2a59c211b6cb54c5f1e988c 741 740 2012-07-12T10:58:18Z Zhongziming 5 wikitext text/x-wiki == List of machines == http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180 == Display a list 
of available GPUs == $ nvidia-smi -L == Using Fupermod on hybrid multicore/GPUs node == * Compiling : Create two separate directories for configuration with selected CPU cblas (e.g. gsl, acml, mkl)and GPU cblas (e.g. cublas). For example: Using acml blas for CPU and cublas for GPU computing cd fupermod/ mkdir acml_config cd acml_config ./configure --with-cblas=acml mkdir cuda_config cd cuda_config ./configure --with-cblas=cuda * Building performance model: rankfile is for processing binding, and appfile tells mpi what programs to launch $ mpirun -rf rankfile -app appfile_fpm An example of a rankfile: rank 0=ig.icl.utk.edu slot=0:0 rank 1=ig.icl.utk.edu slot=0:1 ... An example of an appfile for building functional permanence model (FPM): # GPU # e.g. Linking against cublas, and fupermod is configured under cublas_config # suboption g=0 means device 0 is selected for computing -host localhost -np 1 $HOME/fupermod/cublas_config/tools/builder -l $HOME/fupermod/cublas_config/routines/mxm/.libs/libmxm_col.so -o k=640,g=0 -U10000 -s10 # CPU # e.g. 
Linking against acml, and fupermod is configured under acml_config -host localhost -np 47 $HOME/fupermod/acml_config/tools/builder -l $HOME/fupermod/acml_config/routines/mxm/.libs/libmxm_col.so -o k=640 -U10000 -s10 * Data partitioning matrix size D = N x N, and machinefile lists the nodes participating in the computing $ fupermod/tools/partitioner -l fupermod/routines/mxm/.libs/libmxm_col.so -D10000 -o N=100 -m machinefile * Running matrix multiplication $ mpirun -rf rankfile -app appfile_mxm An example of an appfile for matrix multiplication # GPU # Assuming fupermod is configured under cublas_config, linking against cublas # -g0 means device 0 is selected for computing -host localhost -np 1 $HOME/fupermod/cublas_config/routines/mxm/mxm_col -k640 -g0 -m machinefile # CPU # Assuming fupermod is configured under acml_config, linking against acml -host localhost -np 47 $HOME/fupermod/acml_config/routines/mxm/mxm_col -k640 -m machinefile a38cf169f82d77bbf6699d0c3d8aeca5ca999150 UTK multicores + GPU 0 76 742 741 2012-07-12T10:58:56Z Zhongziming 5 wikitext text/x-wiki == List of machines == http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180 == Display a list of available GPUs == $ nvidia-smi -L == Using Fupermod on hybrid multicore/GPUs node == * Compiling : Create two separate directories for configuration with selected CPU cblas (e.g. gsl, acml, mkl)and GPU cblas (e.g. cublas). For example: Using acml blas for CPU and cublas for GPU computing cd fupermod/ mkdir acml_config cd acml_config ./configure --with-cblas=acml mkdir cuda_config cd cuda_config ./configure --with-cblas=cuda * Building performance model: rankfile is for processing binding, and appfile tells mpi what programs to launch $ mpirun -rf rankfile -app appfile_fpm An example of a rankfile: rank 0=ig.icl.utk.edu slot=0:0 rank 1=ig.icl.utk.edu slot=0:1 ... An example of an appfile for building functional permanence model (FPM): # GPU # e.g. 
Linking against cublas, and fupermod is configured under cublas_config # suboption g=0 means device 0 is selected for computing -host localhost -np 1 $HOME/fupermod/cublas_config/tools/builder -l $HOME/fupermod/cublas_config/routines/mxm/.libs/libmxm_col.so -o k=640,g=0 -U10000 -s10 # CPU # e.g. Linking against acml, and fupermod is configured under acml_config -host localhost -np 47 $HOME/fupermod/acml_config/tools/builder -l $HOME/fupermod/acml_config/routines/mxm/.libs/libmxm_col.so -o k=640 -U10000 -s10 * Data partitioning matrix size D = N x N, and machinefile lists the nodes participating in the computing $ fupermod/tools/partitioner -l fupermod/routines/mxm/.libs/libmxm_col.so -D10000 -o N=100 -m machinefile * Running matrix multiplication $ mpirun -rf rankfile -app appfile_mxm An example of an appfile for matrix multiplication # GPU # Assuming fupermod is configured under cublas_config, linking against cublas # -g0 means device 0 is selected for computing -host localhost -np 1 $HOME/fupermod/cublas_config/routines/mxm/mxm_col -k640 -g0 -m machinefile # CPU # Assuming fupermod is configured under acml_config, linking against acml -host localhost -np 47 $HOME/fupermod/acml_config/routines/mxm/mxm_col -k640 -m machinefile fa2b286306244a49d044267e35c371904c1f821f 743 742 2012-07-12T11:00:46Z Zhongziming 5 wikitext text/x-wiki == List of machines == http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180 == Display a list of available GPUs == $ nvidia-smi -L == Using Fupermod on hybrid multicore/GPUs node == * Compiling : Create two separate directories for configuration with selected CPU cblas (e.g. gsl, acml, mkl)and GPU cblas (e.g. cublas). 
-For example: Using acml blas for CPU and cublas for GPU computing cd fupermod/ mkdir acml_config cd acml_config ./configure --with-cblas=acml mkdir cuda_config cd cuda_config ./configure --with-cblas=cuda * Building performance model: -Rankfile is for process binding, and appfile tells MPI what programs to launch $ mpirun -rf rankfile -app appfile_fpm An example of a rankfile: rank 0=ig.icl.utk.edu slot=0:0 rank 1=ig.icl.utk.edu slot=0:1 ... -An example of an appfile for building the functional performance model (appfile_fpm): # GPU # e.g. Linking against cublas, and fupermod is configured under cublas_config # suboption g=0 means device 0 is selected for computing -host localhost -np 1 $HOME/fupermod/cublas_config/tools/builder -l $HOME/fupermod/cublas_config/routines/mxm/.libs/libmxm_col.so -o k=640,g=0 -U10000 -s10 # CPU # e.g. Linking against acml, and fupermod is configured under acml_config -host localhost -np 47 $HOME/fupermod/acml_config/tools/builder -l $HOME/fupermod/acml_config/routines/mxm/.libs/libmxm_col.so -o k=640 -U10000 -s10 * Data partitioning -Matrix size D = N x N, and machinefile lists the nodes participating in the computing $ fupermod/tools/partitioner -l fupermod/routines/mxm/.libs/libmxm_col.so -D10000 -o N=100 -m machinefile * Running matrix multiplication $ mpirun -rf rankfile -app appfile_mxm -An example of an appfile for matrix multiplication (appfile_mxm) # GPU # Assuming fupermod is configured under cublas_config, linking against cublas # -g0 means device 0 is selected for computing -host localhost -np 1 $HOME/fupermod/cublas_config/routines/mxm/mxm_col -k640 -g0 -m machinefile # CPU # Assuming fupermod is configured under acml_config, linking against acml -host localhost -np 47 $HOME/fupermod/acml_config/routines/mxm/mxm_col -k640 -m machinefile 3ba27c7e3d4389fbbaf9fc3d7e4a89b6788cc804 744 743 2012-07-12T11:01:21Z Zhongziming 5 wikitext text/x-wiki == List of machines == http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180
== Display a list of available GPUs == $ nvidia-smi -L == Using Fupermod on hybrid multicore/GPUs node == * Compiling : Create two separate directories for configuration with selected CPU cblas (e.g. gsl, acml, mkl)and GPU cblas (e.g. cublas). - For example: Using acml blas for CPU and cublas for GPU computing cd fupermod/ mkdir acml_config cd acml_config ./configure --with-cblas=acml mkdir cuda_config cd cuda_config ./configure --with-cblas=cuda * Building performance model: - Rankfile is for processing binding, and appfile tells mpi what programs to launch $ mpirun -rf rankfile -app appfile_fpm An example of a rankfile: rank 0=ig.icl.utk.edu slot=0:0 rank 1=ig.icl.utk.edu slot=0:1 ... - An example of an appfile for building functional permanence model (appfile_fpm): # GPU # e.g. Linking against cublas, and fupermod is configured under cublas_config # suboption g=0 means device 0 is selected for computing -host localhost -np 1 $HOME/fupermod/cublas_config/tools/builder -l $HOME/fupermod/cublas_config/routines/mxm/.libs/libmxm_col.so -o k=640,g=0 -U10000 -s10 # CPU # e.g. 
Linking against acml, and fupermod is configured under acml_config -host localhost -np 47 $HOME/fupermod/acml_config/tools/builder -l $HOME/fupermod/acml_config/routines/mxm/.libs/libmxm_col.so -o k=640 -U10000 -s10 * Data partitioning - Matrix size D = N x N, and machinefile lists the nodes participating in the computing $ fupermod/tools/partitioner -l fupermod/routines/mxm/.libs/libmxm_col.so -D10000 -o N=100 -m machinefile * Running matrix multiplication $ mpirun -rf rankfile -app appfile_mxm - An example of an appfile for matrix multiplication (appfile_mxm) # GPU # Assuming fupermod is configured under cublas_config, linking against cublas # -g0 means device 0 is selected for computing -host localhost -np 1 $HOME/fupermod/cublas_config/routines/mxm/mxm_col -k640 -g0 -m machinefile # CPU # Assuming fupermod is configured under acml_config, linking against acml -host localhost -np 47 $HOME/fupermod/acml_config/routines/mxm/mxm_col -k640 -m machinefile 39eb3689c0266a64eba91e7868295ef58ea8dbfc 745 744 2012-07-12T11:02:15Z Zhongziming 5 wikitext text/x-wiki == List of machines == http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180 == Display a list of available GPUs == $ nvidia-smi -L == Using Fupermod on hybrid multicore/GPUs node == * Compiling : Create two separate directories for configuration with selected CPU cblas (e.g. gsl, acml, mkl)and GPU cblas (e.g. cublas). - For example: Using acml blas for CPU and cublas for GPU computing cd fupermod/ mkdir acml_config cd acml_config ./configure --with-cblas=acml mkdir cuda_config cd cuda_config ./configure --with-cblas=cuda * Building performance model: - Rankfile is for processing binding, and appfile tells mpi what programs to launch $ mpirun -rf rankfile -app appfile_fpm - An example of a rankfile: rank 0=ig.icl.utk.edu slot=0:0 rank 1=ig.icl.utk.edu slot=0:1 ... - An example of an appfile for building functional permanence model (appfile_fpm): # GPU # e.g. 
Linking against cublas, and fupermod is configured under cublas_config # suboption g=0 means device 0 is selected for computing -host localhost -np 1 $HOME/fupermod/cublas_config/tools/builder -l $HOME/fupermod/cublas_config/routines/mxm/.libs/libmxm_col.so -o k=640,g=0 -U10000 -s10 # CPU # e.g. Linking against acml, and fupermod is configured under acml_config -host localhost -np 47 $HOME/fupermod/acml_config/tools/builder -l $HOME/fupermod/acml_config/routines/mxm/.libs/libmxm_col.so -o k=640 -U10000 -s10 * Data partitioning - Matrix size D = N x N, and machinefile lists the nodes participating in the computing $ fupermod/tools/partitioner -l fupermod/routines/mxm/.libs/libmxm_col.so -D10000 -o N=100 -m machinefile * Running matrix multiplication $ mpirun -rf rankfile -app appfile_mxm - An example of an appfile for matrix multiplication (appfile_mxm) # GPU # Assuming fupermod is configured under cublas_config, linking against cublas # -g0 means device 0 is selected for computing -host localhost -np 1 $HOME/fupermod/cublas_config/routines/mxm/mxm_col -k640 -g0 -m machinefile # CPU # Assuming fupermod is configured under acml_config, linking against acml -host localhost -np 47 $HOME/fupermod/acml_config/routines/mxm/mxm_col -k640 -m machinefile e2ebd0e7c1296755c786cfba63c42f9ec8903dcb 750 745 2012-07-12T11:06:33Z Zhongziming 5 wikitext text/x-wiki == List of machines == http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180 == Display a list of available GPUs == $ nvidia-smi -L == Using Fupermod on hybrid multicore/GPUs node == * Compiling : Create two separate directories for configuration with selected CPU cblas (e.g. gsl, acml, mkl) and GPU cblas (e.g. cublas). 
- For example: Using acml blas for CPU and cublas for GPU computing cd fupermod/ mkdir acml_config cd acml_config ./configure --with-cblas=acml mkdir cuda_config cd cuda_config ./configure --with-cblas=cuda * Building performance model: - Rankfile is for processing binding, and appfile tells mpi what programs to launch $ mpirun -rf rankfile -app appfile_fpm - An example of a rankfile: rank 0=ig.icl.utk.edu slot=0:0 rank 1=ig.icl.utk.edu slot=0:1 ... - An example of an appfile for building functional permanence model (appfile_fpm): # GPU # e.g. Linking against cublas, and fupermod is configured under cublas_config # suboption g=0 means device 0 is selected for computing -host localhost -np 1 $HOME/fupermod/cublas_config/tools/builder -l $HOME/fupermod/cublas_config/routines/mxm/.libs/libmxm_col.so -o k=640,g=0 -U10000 -s10 # CPU # e.g. Linking against acml, and fupermod is configured under acml_config -host localhost -np 47 $HOME/fupermod/acml_config/tools/builder -l $HOME/fupermod/acml_config/routines/mxm/.libs/libmxm_col.so -o k=640 -U10000 -s10 * Data partitioning - Matrix size D = N x N, and machinefile lists the nodes participating in the computing $ fupermod/tools/partitioner -l fupermod/routines/mxm/.libs/libmxm_col.so -D10000 -o N=100 -m machinefile * Running matrix multiplication $ mpirun -rf rankfile -app appfile_mxm - An example of an appfile for matrix multiplication (appfile_mxm) # GPU # Assuming fupermod is configured under cublas_config, linking against cublas # -g0 means device 0 is selected for computing -host localhost -np 1 $HOME/fupermod/cublas_config/routines/mxm/mxm_col -k640 -g0 -m machinefile # CPU # Assuming fupermod is configured under acml_config, linking against acml -host localhost -np 47 $HOME/fupermod/acml_config/routines/mxm/mxm_col -k640 -m machinefile 40584a4abfe52cd595bb8c21b895f95c98512334 751 750 2012-07-12T11:08:06Z Zhongziming 5 wikitext text/x-wiki == List of machines == 
http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180 == Display a list of available GPUs == $ nvidia-smi -L == Using Fupermod on hybrid multicore/GPUs node == * Compiling : Create two separate directories for configuration with selected CPU cblas (e.g. gsl, acml, mkl) and GPU cblas (e.g. cublas). - For example: configuring with acml blas for CPU and cublas for GPU, and then make cd fupermod/ mkdir acml_config cd acml_config ./configure --with-cblas=acml make mkdir cuda_config cd cuda_config ./configure --with-cblas=cuda make * Building performance model: - Rankfile is for processing binding, and appfile tells mpi what programs to launch $ mpirun -rf rankfile -app appfile_fpm - An example of a rankfile: rank 0=ig.icl.utk.edu slot=0:0 rank 1=ig.icl.utk.edu slot=0:1 ... - An example of an appfile for building functional permanence model (appfile_fpm): # GPU # e.g. Linking against cublas, and fupermod is configured under cublas_config # suboption g=0 means device 0 is selected for computing -host localhost -np 1 $HOME/fupermod/cublas_config/tools/builder -l $HOME/fupermod/cublas_config/routines/mxm/.libs/libmxm_col.so -o k=640,g=0 -U10000 -s10 # CPU # e.g. 
Linking against acml, and fupermod is configured under acml_config -host localhost -np 47 $HOME/fupermod/acml_config/tools/builder -l $HOME/fupermod/acml_config/routines/mxm/.libs/libmxm_col.so -o k=640 -U10000 -s10 * Data partitioning - Matrix size D = N x N, and machinefile lists the nodes participating in the computing $ fupermod/tools/partitioner -l fupermod/routines/mxm/.libs/libmxm_col.so -D10000 -o N=100 -m machinefile * Running matrix multiplication $ mpirun -rf rankfile -app appfile_mxm - An example of an appfile for matrix multiplication (appfile_mxm) # GPU # Assuming fupermod is configured under cublas_config, linking against cublas # -g0 means device 0 is selected for computing -host localhost -np 1 $HOME/fupermod/cublas_config/routines/mxm/mxm_col -k640 -g0 -m machinefile # CPU # Assuming fupermod is configured under acml_config, linking against acml -host localhost -np 47 $HOME/fupermod/acml_config/routines/mxm/mxm_col -k640 -m machinefile f26eb49198da1ba06bb7a0a6651172f08ad0eae7 752 751 2012-07-12T11:09:29Z Zhongziming 5 wikitext text/x-wiki == List of machines == http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180 == Display a list of available GPUs == $ nvidia-smi -L == Using Fupermod on hybrid multicore/GPUs node == * Compiling : Create two separate directories for configuration with selected CPU cblas (e.g. gsl, acml, mkl) and GPU cblas (e.g. cublas). - For example: configuring with acml for CPU ([http://developer.amd.com/libraries/acml/pages/default.aspx acml]) and cublas for GPU (), and then make cd fupermod/ mkdir acml_config cd acml_config ./configure --with-cblas=acml make mkdir cuda_config cd cuda_config ./configure --with-cblas=cuda make * Building performance model: - Rankfile is for processing binding, and appfile tells mpi what programs to launch $ mpirun -rf rankfile -app appfile_fpm - An example of a rankfile: rank 0=ig.icl.utk.edu slot=0:0 rank 1=ig.icl.utk.edu slot=0:1 ... 
- An example of an appfile for building functional permanence model (appfile_fpm): # GPU # e.g. Linking against cublas, and fupermod is configured under cublas_config # suboption g=0 means device 0 is selected for computing -host localhost -np 1 $HOME/fupermod/cublas_config/tools/builder -l $HOME/fupermod/cublas_config/routines/mxm/.libs/libmxm_col.so -o k=640,g=0 -U10000 -s10 # CPU # e.g. Linking against acml, and fupermod is configured under acml_config -host localhost -np 47 $HOME/fupermod/acml_config/tools/builder -l $HOME/fupermod/acml_config/routines/mxm/.libs/libmxm_col.so -o k=640 -U10000 -s10 * Data partitioning - Matrix size D = N x N, and machinefile lists the nodes participating in the computing $ fupermod/tools/partitioner -l fupermod/routines/mxm/.libs/libmxm_col.so -D10000 -o N=100 -m machinefile * Running matrix multiplication $ mpirun -rf rankfile -app appfile_mxm - An example of an appfile for matrix multiplication (appfile_mxm) # GPU # Assuming fupermod is configured under cublas_config, linking against cublas # -g0 means device 0 is selected for computing -host localhost -np 1 $HOME/fupermod/cublas_config/routines/mxm/mxm_col -k640 -g0 -m machinefile # CPU # Assuming fupermod is configured under acml_config, linking against acml -host localhost -np 47 $HOME/fupermod/acml_config/routines/mxm/mxm_col -k640 -m machinefile a35019872e849415536a46067f4cb3053c0e019e 753 752 2012-07-12T11:10:45Z Zhongziming 5 wikitext text/x-wiki == List of machines == http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180 == Display a list of available GPUs == $ nvidia-smi -L == Using Fupermod on hybrid multicore/GPUs node == * Compiling : Create two separate directories for configuration with selected CPU cblas (e.g. gsl, acml, mkl) and GPU cblas (e.g. cublas). 
- For example: configuring with [http://developer.amd.com/libraries/acml/pages/default.aspx acml] for CPU and [http://developer.nvidia.com/cublas] for GPU cd fupermod/ mkdir acml_config cd acml_config ../configure --with-cblas=acml make mkdir cuda_config cd cuda_config ../configure --with-cblas=cuda make * Building performance model: - Rankfile is for processing binding, and appfile tells mpi what programs to launch $ mpirun -rf rankfile -app appfile_fpm - An example of a rankfile: rank 0=ig.icl.utk.edu slot=0:0 rank 1=ig.icl.utk.edu slot=0:1 ... - An example of an appfile for building functional permanence model (appfile_fpm): # GPU # e.g. Linking against cublas, and fupermod is configured under cublas_config # suboption g=0 means device 0 is selected for computing -host localhost -np 1 $HOME/fupermod/cublas_config/tools/builder -l $HOME/fupermod/cublas_config/routines/mxm/.libs/libmxm_col.so -o k=640,g=0 -U10000 -s10 # CPU # e.g. Linking against acml, and fupermod is configured under acml_config -host localhost -np 47 $HOME/fupermod/acml_config/tools/builder -l $HOME/fupermod/acml_config/routines/mxm/.libs/libmxm_col.so -o k=640 -U10000 -s10 * Data partitioning - Matrix size D = N x N, and machinefile lists the nodes participating in the computing $ fupermod/tools/partitioner -l fupermod/routines/mxm/.libs/libmxm_col.so -D10000 -o N=100 -m machinefile * Running matrix multiplication $ mpirun -rf rankfile -app appfile_mxm - An example of an appfile for matrix multiplication (appfile_mxm) # GPU # Assuming fupermod is configured under cublas_config, linking against cublas # -g0 means device 0 is selected for computing -host localhost -np 1 $HOME/fupermod/cublas_config/routines/mxm/mxm_col -k640 -g0 -m machinefile # CPU # Assuming fupermod is configured under acml_config, linking against acml -host localhost -np 47 $HOME/fupermod/acml_config/routines/mxm/mxm_col -k640 -m machinefile c330b38c7604d3bef4f77a729570b9c192aead4d 754 753 2012-07-12T11:11:01Z Zhongziming 5 
wikitext text/x-wiki == List of machines == http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180 == Display a list of available GPUs == $ nvidia-smi -L == Using Fupermod on hybrid multicore/GPUs node == * Compiling : Create two separate directories for configuration with selected CPU cblas (e.g. gsl, acml, mkl) and GPU cblas (e.g. cublas). - For example: configuring with [http://developer.amd.com/libraries/acml/pages/default.aspx acml] for CPU and [http://developer.nvidia.com/cublas cublas] for GPU cd fupermod/ mkdir acml_config cd acml_config ../configure --with-cblas=acml make mkdir cuda_config cd cuda_config ../configure --with-cblas=cuda make * Building performance model: - Rankfile is for processing binding, and appfile tells mpi what programs to launch $ mpirun -rf rankfile -app appfile_fpm - An example of a rankfile: rank 0=ig.icl.utk.edu slot=0:0 rank 1=ig.icl.utk.edu slot=0:1 ... - An example of an appfile for building functional permanence model (appfile_fpm): # GPU # e.g. Linking against cublas, and fupermod is configured under cublas_config # suboption g=0 means device 0 is selected for computing -host localhost -np 1 $HOME/fupermod/cublas_config/tools/builder -l $HOME/fupermod/cublas_config/routines/mxm/.libs/libmxm_col.so -o k=640,g=0 -U10000 -s10 # CPU # e.g. 
Linking against acml, and fupermod is configured under acml_config -host localhost -np 47 $HOME/fupermod/acml_config/tools/builder -l $HOME/fupermod/acml_config/routines/mxm/.libs/libmxm_col.so -o k=640 -U10000 -s10 * Data partitioning - Matrix size D = N x N, and machinefile lists the nodes participating in the computing $ fupermod/tools/partitioner -l fupermod/routines/mxm/.libs/libmxm_col.so -D10000 -o N=100 -m machinefile * Running matrix multiplication $ mpirun -rf rankfile -app appfile_mxm - An example of an appfile for matrix multiplication (appfile_mxm) # GPU # Assuming fupermod is configured under cublas_config, linking against cublas # -g0 means device 0 is selected for computing -host localhost -np 1 $HOME/fupermod/cublas_config/routines/mxm/mxm_col -k640 -g0 -m machinefile # CPU # Assuming fupermod is configured under acml_config, linking against acml -host localhost -np 47 $HOME/fupermod/acml_config/routines/mxm/mxm_col -k640 -m machinefile 6a870af1502791408282cfe9ee25d8a0e49ab42b 755 754 2012-07-12T11:11:53Z Zhongziming 5 wikitext text/x-wiki == List of machines == http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180 == Display a list of available GPUs == $ nvidia-smi -L == Using Fupermod on hybrid multicore/GPUs node == * Compiling : Create two separate directories for configuration with selected CPU cblas (e.g. gsl, acml, mkl) and GPU cblas (e.g. cublas). - For example: configuring with [http://developer.amd.com/libraries/acml/pages/default.aspx acml] for CPU and [http://developer.nvidia.com/cublas cublas] for GPU $ cd fupermod/ $ mkdir acml_config $ cd acml_config $ ../configure --with-cblas=acml $ make $ mkdir cuda_config $ cd cuda_config $ ../configure --with-cblas=cuda $ make * Building performance model: - Rankfile is for processing binding, and appfile tells mpi what programs to launch $ mpirun -rf rankfile -app appfile_fpm - An example of a rankfile: rank 0=ig.icl.utk.edu slot=0:0 rank 1=ig.icl.utk.edu slot=0:1 ... 
- An example of an appfile for building functional permanence model (appfile_fpm): # GPU # e.g. Linking against cublas, and fupermod is configured under cublas_config # suboption g=0 means device 0 is selected for computing -host localhost -np 1 $HOME/fupermod/cublas_config/tools/builder -l $HOME/fupermod/cublas_config/routines/mxm/.libs/libmxm_col.so -o k=640,g=0 -U10000 -s10 # CPU # e.g. Linking against acml, and fupermod is configured under acml_config -host localhost -np 47 $HOME/fupermod/acml_config/tools/builder -l $HOME/fupermod/acml_config/routines/mxm/.libs/libmxm_col.so -o k=640 -U10000 -s10 * Data partitioning - Matrix size D = N x N, and machinefile lists the nodes participating in the computing $ fupermod/tools/partitioner -l fupermod/routines/mxm/.libs/libmxm_col.so -D10000 -o N=100 -m machinefile * Running matrix multiplication $ mpirun -rf rankfile -app appfile_mxm - An example of an appfile for matrix multiplication (appfile_mxm) # GPU # Assuming fupermod is configured under cublas_config, linking against cublas # -g0 means device 0 is selected for computing -host localhost -np 1 $HOME/fupermod/cublas_config/routines/mxm/mxm_col -k640 -g0 -m machinefile # CPU # Assuming fupermod is configured under acml_config, linking against acml -host localhost -np 47 $HOME/fupermod/acml_config/routines/mxm/mxm_col -k640 -m machinefile be939780d8461a793179c67ca90ba463d3e28c6e 756 755 2012-07-12T12:57:07Z Zhongziming 5 wikitext text/x-wiki == List of machines == http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180 == Display a list of available GPUs == $ nvidia-smi -L == Using Fupermod on hybrid multicore/GPUs node == * Compiling : Create two separate directories for configuration with selected CPU cblas (e.g. gsl, acml, mkl) and GPU cblas (e.g. cublas). 
- For example: configuring with [http://developer.amd.com/libraries/acml/pages/default.aspx acml] for CPU and [http://developer.nvidia.com/cublas cublas] for GPU $ cd fupermod/ $ mkdir acml_config $ cd acml_config $ ../configure --with-cblas=acml $ make $ mkdir cuda_config $ cd cuda_config $ ../configure --with-cblas=cuda $ make * Building performance model: - Rankfile is for [http://www.open-mpi.org/doc/v1.6/man1/mpirun.1.php#toc8 process binding], and appfile tells mpi what programs to launch $ mpirun -rf rankfile -app appfile_fpm - Example of a rankfile: rank 0=ig.icl.utk.edu slot=0:0 rank 1=ig.icl.utk.edu slot=0:1 ... - Example of an appfile for building functional permanence model (appfile_fpm): # GPU # e.g. Linking against cublas, and fupermod is configured under cublas_config # suboption g=0 means device 0 is selected for computing -host localhost -np 1 $HOME/fupermod/cublas_config/tools/builder -l $HOME/fupermod/cublas_config/routines/mxm/.libs/libmxm_col.so -o k=640,g=0 -U10000 -s10 # CPU # e.g. 
Linking against acml, and fupermod is configured under acml_config -host localhost -np 47 $HOME/fupermod/acml_config/tools/builder -l $HOME/fupermod/acml_config/routines/mxm/.libs/libmxm_col.so -o k=640 -U10000 -s10 * Data partitioning - Matrix size D = N x N, and machinefile lists the nodes participating in the computing $ fupermod/tools/partitioner -l fupermod/routines/mxm/.libs/libmxm_col.so -D10000 -o N=100 -m machinefile * Running matrix multiplication $ mpirun -rf rankfile -app appfile_mxm - Example of an appfile for matrix multiplication (appfile_mxm) # GPU # Assuming fupermod is configured under cublas_config, linking against cublas # -g0 means device 0 is selected for computing -host localhost -np 1 $HOME/fupermod/cublas_config/routines/mxm/mxm_col -k640 -g0 -m machinefile # CPU # Assuming fupermod is configured under acml_config, linking against acml -host localhost -np 47 $HOME/fupermod/acml_config/routines/mxm/mxm_col -k640 -m machinefile 359c336e56ee400b94d95986306c83985b81a8f4 757 756 2012-07-16T10:07:28Z Zhongziming 5 wikitext text/x-wiki == List of machines == http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180 == Display a list of available GPUs == $ nvidia-smi -L == Using Fupermod on hybrid multicore/GPUs node == * Compiling : Create two separate directories for configuration with selected CPU cblas (e.g. gsl, acml, mkl) and GPU cblas (e.g. cublas). 
- For example: configuring with [http://developer.amd.com/libraries/acml/pages/default.aspx acml] for CPU and [http://developer.nvidia.com/cublas cublas] for GPU $ cd fupermod/ $ mkdir acml_config $ cd acml_config $ ../configure --with-cblas=acml $ make $ mkdir cuda_config $ cd cuda_config $ ../configure --with-cblas=cuda $ make * Building performance model: - Rankfile is for [http://www.open-mpi.org/doc/v1.6/man1/mpirun.1.php#toc8 process binding], and appfile tells mpirun what programs to launch $ mpirun -rf rankfile -app appfile_fpm - Example of a rankfile: rank 0=ig.icl.utk.edu slot=0:0 rank 1=ig.icl.utk.edu slot=0:1 ... - Example of an appfile for building functional permanence model (appfile_fpm): # GPU # e.g. Linking against cublas, and fupermod is configured under cublas_config # suboption g=0 means device 0 is selected for computing -host localhost -np 1 $HOME/fupermod/cublas_config/tools/builder -l $HOME/fupermod/cublas_config/routines/mxm/.libs/libmxm_col.so -o k=640,g=0 -U10000 -s10 # CPU # e.g. 
Linking against acml, and fupermod is configured under acml_config -host localhost -np 47 $HOME/fupermod/acml_config/tools/builder -l $HOME/fupermod/acml_config/routines/mxm/.libs/libmxm_col.so -o k=640 -U10000 -s10 * Data partitioning - Matrix size D = N x N, and machinefile lists the nodes participating in the computing $ fupermod/tools/partitioner -l fupermod/routines/mxm/.libs/libmxm_col.so -D10000 -o N=100 -m machinefile * Running matrix multiplication $ mpirun -rf rankfile -app appfile_mxm - Example of an appfile for matrix multiplication (appfile_mxm) # GPU # Assuming fupermod is configured under cublas_config, linking against cublas # -g0 means device 0 is selected for computing -host localhost -np 1 $HOME/fupermod/cublas_config/routines/mxm/mxm_col -k640 -g0 -m machinefile # CPU # Assuming fupermod is configured under acml_config, linking against acml -host localhost -np 47 $HOME/fupermod/acml_config/routines/mxm/mxm_col -k640 -m machinefile be9082f694edcf01e8372c2075110f930856edc7 763 757 2012-08-10T11:29:11Z Quintin 10 wikitext text/x-wiki == List of machines == http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180 == Display a list of available GPUs == $ nvidia-smi -L == Using Fupermod on hybrid multicore/GPUs node == * Compiling : Create two separate directories for configuration with selected CPU cblas (e.g. gsl, acml, mkl) and GPU cblas (e.g. cublas). 
- For example: configuring with [http://developer.amd.com/libraries/acml/pages/default.aspx acml] for CPU and [http://developer.nvidia.com/cublas cublas] for GPU $ cd fupermod/ $ mkdir acml_config $ cd acml_config $ ../configure --with-cblas=acml $ make $ mkdir cuda_config $ cd cuda_config $ ../configure --with-cblas=cuda $ make * Building performance model: - Rankfile is for [http://www.open-mpi.org/doc/v1.6/man1/mpirun.1.php#toc8 process binding] (TODO correction this is a dead link), and appfile tells mpirun what programs to launch $ mpirun -rf rankfile -app appfile_fpm - Example of a rankfile: rank 0=ig.icl.utk.edu slot=0:0 rank 1=ig.icl.utk.edu slot=0:1 ... - Example of an appfile for building functional permanence model (appfile_fpm): # GPU # e.g. Linking against cublas, and fupermod is configured under cublas_config # suboption g=0 means device 0 is selected for computing -host localhost -np 1 $HOME/fupermod/cublas_config/tools/builder -l $HOME/fupermod/cublas_config/routines/mxm/.libs/libmxm_col.so -o k=640,g=0 -U10000 -s10 # CPU # e.g. 
Linking against acml, and fupermod is configured under acml_config -host localhost -np 47 $HOME/fupermod/acml_config/tools/builder -l $HOME/fupermod/acml_config/routines/mxm/.libs/libmxm_col.so -o k=640 -U10000 -s10 * Data partitioning - Matrix size D = N x N, and machinefile lists the nodes participating in the computing $ fupermod/tools/partitioner -l fupermod/routines/mxm/.libs/libmxm_col.so -D10000 -o N=100 -m machinefile * Running matrix multiplication $ mpirun -rf rankfile -app appfile_mxm - Example of an appfile for matrix multiplication (appfile_mxm) # GPU # Assuming fupermod is configured under cublas_config, linking against cublas # -g0 means device 0 is selected for computing -host localhost -np 1 $HOME/fupermod/cublas_config/routines/mxm/mxm_col -k640 -g0 -m machinefile # CPU # Assuming fupermod is configured under acml_config, linking against acml -host localhost -np 47 $HOME/fupermod/acml_config/routines/mxm/mxm_col -k640 -m machinefile 57b04a07aa5e865abb8c31d76b712b7d0a2a1bec 765 763 2012-08-10T11:36:24Z Quintin 10 wikitext text/x-wiki == List of machines == http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180 == Display a list of available GPUs == $ nvidia-smi -L == Using Fupermod on hybrid multicore/GPUs node == * Compiling : Create two separate directories for configuration with selected CPU cblas (e.g. gsl, acml, mkl) and GPU cblas (e.g. cublas). 
- For example: configuring with [http://developer.amd.com/libraries/acml/pages/default.aspx acml] for CPU and [http://developer.nvidia.com/cublas cublas] for GPU $ cd fupermod/ $ mkdir acml_config $ cd acml_config $ ../configure --with-cblas=acml $ make $ mkdir cuda_config $ cd cuda_config $ ../configure --with-cblas=cuda $ make * Building performance model: - Rankfile is for [http://www.open-mpi.org/doc/v1.6/man1/mpirun.1.php#toc8 process binding] (TODO correction this is a dead link), and appfile tells mpirun what programs to launch $ mpirun -rf rankfile -app appfile_fpm - Example of a rankfile: rank 0=ig.icl.utk.edu slot=0:0 rank 1=ig.icl.utk.edu slot=0:1 ... - Example of an appfile for building functional permanence model (appfile_fpm): # GPU # e.g. Linking against cublas, and fupermod is configured under cublas_config # suboption g=0 means device 0 is selected for computing -host localhost -np 1 $HOME/fupermod/cublas_config/tools/builder -l $HOME/fupermod/cublas_config/routines/mxm/.libs/libmxm_col.so -o k=640,g=0 -U10000 -s10 # CPU # e.g. Linking against acml, and fupermod is configured under acml_config -host localhost -np 47 $HOME/fupermod/acml_config/tools/builder -l $HOME/fupermod/acml_config/routines/mxm/.libs/libmxm_col.so -o k=640 -U10000 -s10 - TODO this library doesn't exist anymore which one has to be used? -- when we execute it's complaining about the file ./nodetype.hostname.dev what is it? -- I got the error rank_intra >= nodesize, 1 >= 0 could you tell me what is wrong? 
* Data partitioning - Matrix size D = N x N, and machinefile lists the nodes participating in the computing $ fupermod/tools/partitioner -l fupermod/routines/mxm/.libs/libmxm_col.so -D10000 -o N=100 -m machinefile * Running matrix multiplication $ mpirun -rf rankfile -app appfile_mxm - Example of an appfile for matrix multiplication (appfile_mxm) # GPU # Assuming fupermod is configured under cublas_config, linking against cublas # -g0 means device 0 is selected for computing -host localhost -np 1 $HOME/fupermod/cublas_config/routines/mxm/mxm_col -k640 -g0 -m machinefile # CPU # Assuming fupermod is configured under acml_config, linking against acml -host localhost -np 47 $HOME/fupermod/acml_config/routines/mxm/mxm_col -k640 -m machinefile b580580085fad4081504e86222d12543b8345661 771 765 2012-08-22T11:36:32Z Zhongziming 5 wikitext text/x-wiki == List of machines == http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180 == Display a list of available GPUs == $ nvidia-smi -L == Using Fupermod on hybrid multicore/GPUs node == * Compiling : Create two separate directories for configuration with selected CPU cblas (e.g. gsl, acml, mkl) and GPU cblas (e.g. cublas). - For example: configuring with [http://developer.amd.com/libraries/acml/pages/default.aspx acml] for CPU and [http://developer.nvidia.com/cublas cublas] for GPU $ cd fupermod/ $ mkdir acml_config $ cd acml_config $ ../configure --with-cblas=acml $ make $ mkdir cuda_config $ cd cuda_config $ ../configure --with-cblas=cuda $ make * Building performance model: - Rankfile is for [http://www.open-mpi.org/doc/v1.6/man1/mpirun.1.php#sect8 process binding], and appfile tells mpirun what programs to launch $ mpirun -rf rankfile -app appfile_fpm - Example of a rankfile: rank 0=ig.icl.utk.edu slot=0:0 rank 1=ig.icl.utk.edu slot=0:1 ... - Example of an appfile for building functional permanence model (appfile_fpm): # GPU # e.g. 
Linking against cublas, and fupermod is configured under cublas_config # suboption g=0 means device 0 is selected for computing -host localhost -np 1 $HOME/fupermod/cublas_config/tools/builder -l $HOME/fupermod/cublas_config/routines/mxm/.libs/libmxm_1d.so -o k=640,g=0 -U10000 -s10 # CPU # e.g. Linking against acml, and fupermod is configured under acml_config -host localhost -np 47 $HOME/fupermod/acml_config/tools/builder -l $HOME/fupermod/acml_config/routines/mxm/.libs/libmxm_1d.so -o k=640 -U10000 -s10 * Data partitioning - Matrix size D = N x N, and machinefile lists the nodes participating in the computing $ fupermod/tools/partitioner -l fupermod/routines/mxm/.libs/libmxm_1d.so -D10000 -o N=100 -m machinefile * Running matrix multiplication $ mpirun -rf rankfile -app appfile_mxm - Example of an appfile for matrix multiplication (appfile_mxm) # GPU # Assuming fupermod is configured under cublas_config, linking against cublas # -g0 means device 0 is selected for computing -host localhost -np 1 $HOME/fupermod/cublas_config/routines/mxm/mxm_2d -k640 -g0 -m machinefile # CPU # Assuming fupermod is configured under acml_config, linking against acml -host localhost -np 47 $HOME/fupermod/acml_config/routines/mxm/mxm_2d -k640 -m machinefile dcb10505d86837f242eeeb52f7f6b5bbfb3a5863 772 771 2012-08-22T22:32:58Z Zhongziming 5 wikitext text/x-wiki == List of machines == http://icl.cs.utk.edu/iclhelp/custom/index.html?lid=97&slid=180 == Display a list of available GPUs == $ nvidia-smi -L == Using Fupermod on hybrid multicore/GPUs node == * Compiling : Create two separate directories for configuration with selected CPU cblas (e.g. gsl, acml, mkl) and GPU cblas (e.g. cublas). 
- For example: configuring with [http://developer.amd.com/libraries/acml/pages/default.aspx acml] for CPU and [http://developer.nvidia.com/cublas cublas] for GPU $ cd fupermod/ $ mkdir acml_config $ cd acml_config $ ../configure --with-blas=acml $ make $ mkdir cuda_config $ cd cuda_config $ ../configure --with-blas=cuda $ make * Building performance model: - Rankfile is for [http://www.open-mpi.org/doc/v1.6/man1/mpirun.1.php#sect8 process binding], and appfile tells mpirun what programs to launch $ mpirun -rf rankfile -app appfile_fpm - Example of a rankfile: rank 0=ig.icl.utk.edu slot=0:0 rank 1=ig.icl.utk.edu slot=0:1 ... - Example of an appfile for building functional permanence model (appfile_fpm): # GPU # e.g. Linking against cublas, and fupermod is configured under cublas_config # suboption g=0 means device 0 is selected for computing -host localhost -np 1 $HOME/fupermod/cublas_config/tools/builder -l $HOME/fupermod/cublas_config/routines/mxm/.libs/libmxm_1d.so -o k=640,g=0 -U10000 -s10 # CPU # e.g. 
Linking against acml, and fupermod is configured under acml_config -host localhost -np 47 $HOME/fupermod/acml_config/tools/builder -l $HOME/fupermod/acml_config/routines/mxm/.libs/libmxm_1d.so -o k=640 -U10000 -s10 * Data partitioning - Matrix size D = N x N, and machinefile lists the nodes participating in the computing $ fupermod/tools/partitioner -l fupermod/routines/mxm/.libs/libmxm_1d.so -D10000 -o N=100 -m machinefile * Running matrix multiplication $ mpirun -rf rankfile -app appfile_mxm - Example of an appfile for matrix multiplication (appfile_mxm) # GPU # Assuming fupermod is configured under cublas_config, linking against cublas # -g0 means device 0 is selected for computing -host localhost -np 1 $HOME/fupermod/cublas_config/routines/mxm/mxm_2d -k640 -g0 -m machinefile # CPU # Assuming fupermod is configured under acml_config, linking against acml -host localhost -np 47 $HOME/fupermod/acml_config/routines/mxm/mxm_2d -k640 -m machinefile 7cb5e4dbea19bc703edf027e30700e3aa5268acb BLAS LAPACK ScaLAPACK 0 15 746 628 2012-07-12T11:03:28Z Zhongziming 5 wikitext text/x-wiki A de facto standard API for linear algebra [http://en.wikipedia.org/wiki/Basic_Linear_Algebra_Subprograms BLAS]/[http://en.wikipedia.org/wiki/LAPACK LAPACK] * Original http://www.netlib.org/blas/ http://www.netlib.org/lapack/ - implemented in Fortran. The libraries can be used in C/C++ (so called Fortran interface to BLAS/LAPACK). * ATLAS http://math-atlas.sourceforge.net/ - provides a C interface to BLAS and partially LAPACK. Binary packages: libatlas-[base or platform name, for example sse2] * MKL http://software.intel.com/en-us/intel-mkl/ - Intel implementation *ACML http://developer.amd.com/libraries/acml/pages/default.aspx Using the C interface is preferable. 
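As a rough sketch of what switching between these implementations looks like in practice, only the link line changes (the test program name and the Debian-style library names below are assumptions, not part of this page):

```shell
# hypothetical test program mxm_test.c calling cblas_dgemm() through the C interface
gcc mxm_test.c -o mxm_atlas -lcblas -latlas -lm   # ATLAS C interface
gcc mxm_test.c -o mxm_gsl -lgslcblas -lm          # CBLAS bundled with GSL
# MKL and ACML ship their own headers and link lines; see the vendor documentation
```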
[http://www.inf.bv.tum.de/~heisserer/softwarelab04/doc/blas_report.pdf BLAS: overview, installation, usage] = ScaLAPACK = http://www.netlib.org/scalapack/ 6e3693f20ba58bcb2319d92b5c4f512a317f890f 747 746 2012-07-12T11:04:10Z Zhongziming 5 wikitext text/x-wiki A de facto standard API for linear algebra [http://en.wikipedia.org/wiki/Basic_Linear_Algebra_Subprograms BLAS]/[http://en.wikipedia.org/wiki/LAPACK LAPACK] * Original http://www.netlib.org/blas/ http://www.netlib.org/lapack/ - implemented in Fortran. The libraries can be used in C/C++ (so called Fortran interface to BLAS/LAPACK). * ATLAS http://math-atlas.sourceforge.net/ - provides a C interface to BLAS and partially LAPACK. Binary packages: libatlas-[base or platform name, for example sse2] * MKL http://software.intel.com/en-us/intel-mkl/ - Intel implementation *ACML http://developer.amd.com/libraries/acml/pages/default.aspx *CUBLAS http://developer.nvidia.com/cublas Using the C interface is preferable. [http://www.inf.bv.tum.de/~heisserer/softwarelab04/doc/blas_report.pdf BLAS: overview, installation, usage] = ScaLAPACK = http://www.netlib.org/scalapack/ baf644aee95210f32119275cc0736db121126a41 CUDA SDK 0 92 748 2012-07-12T11:04:57Z Zhongziming 5 Created page with "http://developer.nvidia.com/gpu-computing-sdk" wikitext text/x-wiki http://developer.nvidia.com/gpu-computing-sdk 5c5a64675a0b497478f2390202226bcc8efa39cf OpenMPI 0 47 749 657 2012-07-12T11:05:59Z Zhongziming 5 wikitext text/x-wiki http://www.open-mpi.org/faq/ == MCA parameter files == If you want to permanently use some MCA parameter settings, you can create a file $HOME/.openmpi/mca-params.conf, e.g.: cat $HOME/.openmpi/mca-params.conf btl_tcp_if_exclude = lo,eth1 == Running applications on Multiprocessors/Multicores == Process can be bound to specific sockets and cores on nodes by choosing right options of mpirun. 
* [http://www.open-mpi.org/doc/v1.4/man1/mpirun.1.php#sect9 Process binding] * [http://www.open-mpi.org/doc/v1.4/man1/mpirun.1.php#sect10 Rankfile] == PERUSE == [[Media:current_peruse_spec.pdf|PERUSE Specification]] a6aaf949d0ac094021f56a0341305e400eabe9a3 770 749 2012-08-22T10:45:46Z Xalid 12 /* MCA parameter files */ wikitext text/x-wiki http://www.open-mpi.org/faq/ == MCA parameter files == If you want to permanently use some MCA parameter settings, you can create a file $HOME/.openmpi/mca-params.conf, e.g.: cat $HOME/.openmpi/mca-params.conf btl_tcp_if_exclude = lo,eth1 == Handling SSH key issues == This trick avoids a confirmation message asking "yes" when asked by SSH if a host should be added to known_hosts: ssh -q -o StrictHostKeyChecking=no So with OpenMPI it can be used as mpirun --mca plm_rsh_agent "ssh -q -o StrictHostKeyChecking=no" == Running applications on Multiprocessors/Multicores == Process can be bound to specific sockets and cores on nodes by choosing right options of mpirun. * [http://www.open-mpi.org/doc/v1.4/man1/mpirun.1.php#sect9 Process binding] * [http://www.open-mpi.org/doc/v1.4/man1/mpirun.1.php#sect10 Rankfile] == PERUSE == [[Media:current_peruse_spec.pdf|PERUSE Specification]] e225bd3d9d7173af932caf3bd24feb81e988957e MPI 0 29 758 639 2012-07-16T22:30:42Z Davepc 2 /* Profiling */ wikitext text/x-wiki == Documentation == * http://www.mpi-forum.org/docs/docs.html == Implementations == * [[LAM]] * [[MPICH]] * [[OpenMPI]] * [[MPICH2]] == Manual installation == Install in separate subfolder <code>$HOME/SUBDIR</code>, because you may need some MPI implementations (see [[Libraries]]) == Tips & Tricks == * For safe consecutive communications create new context, for example: <source lang="C"> int communication_operation(MPI_Comm comm) { MPI_Comm newcomm; MPI_Comm_dup(comm, &newcomm); ... // work with newcomm MPI_Comm_free(&newcomm); } </source> Mind the overhead of <code>MPI_Comm_dup</code> and <code>MPI_Comm_free</code>. 
* If you are having trouble with the multi-homed nature of the HCL Cluster, check [http://www.open-mpi.org/faq/?category=tcp#tcp-selection here] == Debugging == * Add the following code: <source lang="C"> int rank; MPI_Comm_rank(MPI_COMM_WORLD, &rank); if (!rank) getc(stdin); MPI_Barrier(MPI_COMM_WORLD); </source> * Compile your code with <code>-g</code> option * Run parallel application * Attach to process(es) from [[GDB]] ** MPICH-1 runs a background process for each application process: 0, 0b, 1, 1b, ..., therefore, attach to the first ones. == Profiling == [http://www.bsc.es/computer-sciences/performance-tools/paraver/general-overview Paraver] by Barcelona Supercomputing Center is a "a flexible performance visualization and analysis tool" 39a98c28d96b2bcf1d753a38702039cd052d9ed1 759 758 2012-07-16T22:40:26Z Davepc 2 /* Profiling */ wikitext text/x-wiki == Documentation == * http://www.mpi-forum.org/docs/docs.html == Implementations == * [[LAM]] * [[MPICH]] * [[OpenMPI]] * [[MPICH2]] == Manual installation == Install in separate subfolder <code>$HOME/SUBDIR</code>, because you may need some MPI implementations (see [[Libraries]]) == Tips & Tricks == * For safe consecutive communications create new context, for example: <source lang="C"> int communication_operation(MPI_Comm comm) { MPI_Comm newcomm; MPI_Comm_dup(comm, &newcomm); ... // work with newcomm MPI_Comm_free(&newcomm); } </source> Mind the overhead of <code>MPI_Comm_dup</code> and <code>MPI_Comm_free</code>. 
* If you are having trouble with the multi-homed nature of the HCL Cluster, check [http://www.open-mpi.org/faq/?category=tcp#tcp-selection here] == Debugging == * Add the following code: <source lang="C"> int rank; MPI_Comm_rank(MPI_COMM_WORLD, &rank); if (!rank) getc(stdin); MPI_Barrier(MPI_COMM_WORLD); </source> * Compile your code with <code>-g</code> option * Run parallel application * Attach to process(es) from [[GDB]] ** MPICH-1 runs a background process for each application process: 0, 0b, 1, 1b, ..., therefore, attach to the first ones. == Profiling == [http://www.bsc.es/computer-sciences/performance-tools/paraver/general-overview Paraver] by Barcelona Supercomputing Center is a "a flexible performance visualization and analysis tool" [http://www.bsc.es/computer-sciences/performance-tools/downloads Download from here]. Use Extrae to create trace files. mpirun -np 3 ~/bin/trace.sh ./executable Where trace.sh is a script containing: <source lang="bash">#!/bin/bash export EXTRAE_HOME=$HOME export EXTRAE_CONFIG_FILE=$HOME/bin/extrae.xml export LD_LIBRARY_PATH=${EXTRAE_HOME}/lib:@sub_MPI_HOME@/lib:@sub_PAPI_HOME@/lib:@sub_UNWIND_HOME@/lib:$LD_LIBRARY_PATH export LD_PRELOAD=${EXTRAE_HOME}/lib/libmpitrace.so ## Run the desired program $*</source> 071f6cbf4aed35e957372bac8819b7990ced754d 760 759 2012-07-16T22:59:15Z Davepc 2 wikitext text/x-wiki == Documentation == * http://www.mpi-forum.org/docs/docs.html == Implementations == * [[LAM]] * [[MPICH]] * [[OpenMPI]] * [[MPICH2]] == Manual installation == Install in separate subfolder <code>$HOME/SUBDIR</code>, because you may need some MPI implementations (see [[Libraries]]) == Tips & Tricks == * For safe consecutive communications create new context, for example: <source lang="C"> int communication_operation(MPI_Comm comm) { MPI_Comm newcomm; MPI_Comm_dup(comm, &newcomm); ... // work with newcomm MPI_Comm_free(&newcomm); } </source> Mind the overhead of <code>MPI_Comm_dup</code> and <code>MPI_Comm_free</code>. 
* If you are having trouble with the multi-homed nature of the HCL Cluster, check [http://www.open-mpi.org/faq/?category=tcp#tcp-selection here] == Debugging == * Add the following code: <source lang="C"> int rank; MPI_Comm_rank(MPI_COMM_WORLD, &rank); if (!rank) getc(stdin); MPI_Barrier(MPI_COMM_WORLD); </source> * Compile your code with <code>-g</code> option * Run parallel application * Attach to process(es) from [[GDB]] ** MPICH-1 runs a background process for each application process: 0, 0b, 1, 1b, ..., therefore, attach to the first ones. == Profiling == [http://www.bsc.es/computer-sciences/performance-tools/paraver/general-overview Paraver] by Barcelona Supercomputing Center is a "a flexible performance visualization and analysis tool" [http://www.bsc.es/computer-sciences/performance-tools/downloads Download from here]. Use Extrae to create trace files. Configered and installed extrae on Grid5000 with: ./configure --prefix=$HOME --with-papi=$HOME --with-mpi=/usr --enable-openmp --with-unwind=$HOME --without-dyninst make; make install Create trace.sh (modified from example extrae file): <source lang="bash">#!/bin/bash export EXTRAE_HOME=$HOME export EXTRAE_CONFIG_FILE=$HOME/bin/extrae.xml export LD_LIBRARY_PATH=${EXTRAE_HOME}/lib:@sub_MPI_HOME@/lib:@sub_PAPI_HOME@/lib:@sub_UNWIND_HOME@/lib:$LD_LIBRARY_PATH export LD_PRELOAD=${EXTRAE_HOME}/lib/libmpitrace.so ## Run the desired program $*</source> Using the standard extrae.xml supplied with the package. 
mpirun -np 3 ~/bin/trace.sh ./executable Files created: ''TRACE.mpits, TRACExxxxxx.mpit'' On head node run: mpi2prv -f TRACE.mpits -e ./executable -o output_tracefile.prv On local machine open ''output_tracefile.prv'' with paraver 4db74804d160049f28f78fdd6ae026fe040175e8 761 760 2012-07-16T23:01:54Z Davepc 2 wikitext text/x-wiki == Documentation == * http://www.mpi-forum.org/docs/docs.html == Implementations == * [[LAM]] * [[MPICH]] * [[OpenMPI]] * [[MPICH2]] == Manual installation == Install in a separate subfolder <code>$HOME/SUBDIR</code>, because you may need several MPI implementations (see [[Libraries]]) == Tips & Tricks == * For safe consecutive communications, create a new context, for example: <source lang="C"> int communication_operation(MPI_Comm comm) { MPI_Comm newcomm; MPI_Comm_dup(comm, &newcomm); ... // work with newcomm MPI_Comm_free(&newcomm); } </source> Mind the overhead of <code>MPI_Comm_dup</code> and <code>MPI_Comm_free</code>. * If you are having trouble with the multi-homed nature of the HCL Cluster, check [http://www.open-mpi.org/faq/?category=tcp#tcp-selection here] == Debugging == * Add the following code: <source lang="C"> int rank; MPI_Comm_rank(MPI_COMM_WORLD, &rank); if (!rank) getc(stdin); MPI_Barrier(MPI_COMM_WORLD); </source> * Compile your code with the <code>-g</code> option * Run the parallel application * Attach to process(es) from [[GDB]] ** MPICH-1 runs a background process for each application process: 0, 0b, 1, 1b, ..., therefore, attach to the first ones. == Profiling == [http://www.bsc.es/computer-sciences/performance-tools/paraver/general-overview Paraver] by Barcelona Supercomputing Center is "a flexible performance visualization and analysis tool". [http://www.bsc.es/computer-sciences/performance-tools/downloads Download here] and [http://www.bsc.es/computer-sciences/performance-tools/documentation tutorials here]. Use Extrae to create trace files.
Configured and installed extrae on Grid5000 with: ./configure --prefix=$HOME --with-papi=$HOME --with-mpi=/usr --enable-openmp --with-unwind=$HOME --without-dyninst make; make install Create trace.sh (modified from the example extrae file): <source lang="bash">#!/bin/bash export EXTRAE_HOME=$HOME export EXTRAE_CONFIG_FILE=$HOME/bin/extrae.xml export LD_LIBRARY_PATH=${EXTRAE_HOME}/lib:@sub_MPI_HOME@/lib:@sub_PAPI_HOME@/lib:@sub_UNWIND_HOME@/lib:$LD_LIBRARY_PATH export LD_PRELOAD=${EXTRAE_HOME}/lib/libmpitrace.so ## Run the desired program $*</source> Using the standard extrae.xml supplied with the package. mpirun -np 3 ~/bin/trace.sh ./executable Files created: ''TRACE.mpits, TRACExxxxxx.mpit'' On the head node run: mpi2prv -f TRACE.mpits -e ./executable -o output_tracefile.prv On the local machine open ''output_tracefile.prv'' with paraver 7e16bd7825e708f5fa6e9bcb172ca592b794dfdc GDB 0 81 762 677 2012-07-23T14:52:09Z Davepc 2 wikitext text/x-wiki Debugging with GDB: compile the program with -g -O0 (or ./configure --enable-debug) For a serial program (or MPI running with 1 process) gdb ./program_name To debug an MPI application in parallel, add these lines somewhere in the code: if (!rank) getc(stdin); MPI_Barrier(MPI_COMM_WORLD); Then run the code and it will hang on that line. If you get the gdb error: fupermod_....c: No such file or directory. Run the following command to add to gdb's list of source directories to be searched: directory ~/fupermod/ 6d13da110324a9e90d391f4ce2af388a47813d9f Talk:UTK multicores + GPU 1 93 764 2012-08-10T11:29:34Z Quintin 10 Created page with "Is it possible to get the same thing with mpich?" wikitext text/x-wiki Is it possible to get the same thing with mpich?
b6ac7882af6fa1bd23c8fc824abe22a7a5ff3313 Autotools 0 21 766 589 2012-08-16T10:01:32Z Root 1 wikitext text/x-wiki http://en.wikipedia.org/wiki/Autoconf http://sourceware.org/autobook/autobook/autobook_toc.html == Manuals == * http://www.gnu.org/software/autoconf/manual/index.html * http://www.gnu.org/software/automake/manual/index.html * http://www.gnu.org/software/libtool/manual/index.html == Tutorials == * http://www.lrde.epita.fr/~adl/autotools.html (very nice set of slides) == Libraries == * includes (for the include directory): <code>include_HEADERS = ...</code> * library: static <code>lib_LIBRARIES = library.a</code> or dynamic <code>lib_LTLIBRARIES = library.la</code> * sources (internal C data structures and C++ template classes): <code>library_X_SOURCES = library.h ...</code>, where <code>X</code> = <code>a</code> or <code>la</code> [http://gforge.ucd.ie/scm/viewvc.php/*checkout*/trunk/MPIBlib/benchmarks/Makefile.am?root=cpm Example] == Configured headers == Configured headers (created from *.h.in) must not be included into the package, that is <code>include_HEADERS</code> or <code>*_SOURCES</code> * <code>nodist_include_HEADERS = *.h</code> for the configured headers as includes * <code>nodist_*_SOURCES = *.h</code> or <code>BUILT_SOURCES = *.h</code> for the configured headers as sources [http://gforge.ucd.ie/scm/viewvc.php/*checkout*/trunk/MPIBlib/collectives/Makefile.am?root=cpm Example] == Extra files == To add extra files into package, use <code>EXTRA_DIST = *</code>. [http://gforge.ucd.ie/scm/viewvc.php/*checkout*/trunk/MPIBlib/tools/Makefile.am?root=cpm Example] == Conditional building == * http://www.gnu.org/software/hello/manual/automake/Conditionals.html * In the source code, use macros <source lang="C"> #ifdef SYMBOL ... 
#endif </source> == MPI support == * Define MPI compilers/linkers in configure.ac <source lang="text"> AC_PROG_CC([mpicc]) AC_PROG_CXX([mpic++ mpicxx]) </source> == C/C++ support == * To check C++ features, switch to C++ language in configure.ac <source lang="text"> AC_LANG_PUSH(C++) AC_CHECK_HEADER([header.hpp]) AC_LANG_POP(C++) </source> * To link C code with C++ libraries, add a non-existent C++ file dummy.cpp to sources in Makefile.am <source lang="text"> bin_PROGRAMS = program program_SOURCES = program.c nodist_EXTRA_program_SOURCES = dummy.cpp program_LDADD = cpplibrary.a </source> == Script for downloading & installing recent versions (in March 2010) of m4, libtool, autoconf, automake== #!/bin/bash parent_dir=$PWD export PATH=$HOME/$ARCH/bin:$PATH wget http://ftp.gnu.org/gnu/libtool/libtool-2.2.6b.tar.gz tar xzf libtool-2.2.6b.tar.gz cd libtool-2.2.6b ./configure --prefix=$HOME/$ARCH make install cd $parent_dir wget http://ftp.gnu.org/gnu/m4/m4-1.4.14.tar.gz tar xfz m4-1.4.14.tar.gz cd m4-1.4.14 ./configure --prefix=$HOME/$ARCH make install cd $parent_dir wget http://ftp.gnu.org/gnu/autoconf/autoconf-2.65.tar.bz2 tar xjf autoconf-2.65.tar.bz2 cd autoconf-2.65 ./configure --prefix=$HOME/$ARCH make install cd $parent_dir wget http://ftp.gnu.org/gnu/automake/automake-1.10.3.tar.bz2 tar xjf automake-1.10.3.tar.bz2 cd automake-1.10.3 ./configure --prefix=$HOME/$ARCH make install cd $parent_dir 26509d53edd0c0d48857bf4b7a0217b4bbf43545 SSH 0 36 767 560 2012-08-22T10:41:01Z Xalid 12 wikitext text/x-wiki == Passwordless SSH == To set up passwordless SSH, there are three main things to do: * generate a pair of public/private keys on your local computer * copy the public key from the source computer to the target computer's authorized_keys file * check the permissions. You can repeat that transitively for "A->B->C". You can use the initial pair of keys everywhere. 
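The three steps above can be sketched as follows (a minimal sketch assuming OpenSSH; user@target is a placeholder for your account on the remote machine):

```shell
# 1. generate a key pair on the local computer (empty passphrase, so no prompt)
ssh-keygen -t rsa -f ~/.ssh/id_rsa -N ""
# 2. append the public key to the target's authorized_keys file
ssh-copy-id user@target
# 3. check the permissions: sshd ignores keys if these are too open
ssh user@target 'chmod 700 ~/.ssh && chmod 600 ~/.ssh/authorized_keys'
```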
See here for details: http://www.stearns.org/doc/ssh-techniques.current.html == Automatically saying "yes" == This expect script automates typing "yes" when asked by SSH if a host should be added to known_hosts #!/usr/bin/expect -f set arg1 [lindex $argv 0] set timeout 2 spawn ssh $arg1 expect "yes/no" { send "yes\n" } send "exit\n" send "\r" You can include it in a bash script to iterate over all nodes like this: for i in `uniq hostfile` ; do ./say-yes.exp $i done == Better than automatically saying "yes" == Remark: It turns out there is a more elegant way to do this task: using a tool called ''ssh-keyscan''. == Making a cascade of SSH connections easy == Here is a very convenient way to set up access to any machine directly instead of doing a cascade of SSH calls. If you cannot directly access e.g. the machine "heterogeneous", but you can log into "csserver" and then to "heterogeneous", you can put this into your .ssh/config file : Host csserver User kdichev Hostname csserver.ucd.ie Host heterogeneous User kiril Hostname heterogeneous.ucd.ie ProxyCommand ssh -qax csserver nc %h %p Since the installation of a new PBS system, you cannot directly log into a hclXX node.
You can do ssh heterogeneous instead and use "qsub" [[HCL_cluster#Access_and_Security]] == The best way is saying no "YES" == This ssh trick avoids typing "yes" when asked by SSH if a host should be added to known_hosts: ssh -q -o StrictHostKeyChecking=no So with OpenMPI it can be used as mpirun --mca plm_rsh_agent "ssh -q -o StrictHostKeyChecking=no" == X11 forwarding == <code lang="bash"> ssh -X hostname </code> or add the following line to your .ssh/ssh_config file ForwardX11 yes 922f0a40a35aeacf989d42f35e9cd79d5d015c99 768 767 2012-08-22T10:43:04Z Xalid 12 /* The best way is saying no "YES" */ wikitext text/x-wiki == Passwordless SSH == To set up passwordless SSH, there are three main things to do: * generate a pair of public/private keys on your local computer * copy the public key from the source computer to the target computer's authorized_keys file * check the permissions. You can repeat that transitively for "A->B->C". You can use the initial pair of keys everywhere. See here for details: http://www.stearns.org/doc/ssh-techniques.current.html == Automatically saying "yes" == This expect script automates typing "yes" when asked by SSH if a host should be added to known_hosts #!/usr/bin/expect -f set arg1 [lindex $argv 0] set timeout 2 spawn ssh $arg1 expect "yes/no" { send "yes\n" } send "exit\n" send "\r" You can include it in a bash script to iterate over all nodes doing this: for i in `uniq hostfile` ; do ./say-yes.exp $i done == Better than automatically saying "yes" == Remark: It turns out there is a more ellegant way to do this task: using a tool called ''ssh-keyscan''. == Making a cascade of SSH connections easy == Here is a very convenient way to set up the access to any machine directly instead of doing a cascade of SSH calls. If you can not directly access e.g. 
the machine "heterogeneous", but you can log into "csserver" and then to "heterogeneous", you can put this into your .ssh/config file : Host csserver User kdichev Hostname csserver.ucd.ie Host heterogeneous User kiril Hostname heterogeneous.ucd.ie ProxyCommand ssh -qax csserver nc %h %p Since the installation of a new PBS system, you can not directly log into a hclXX node. You can do ssh heterogeneous instead and use "qsub" [[HCL_cluster#Access_and_Security]] == The best way is saying no "YES" == This ssh trick avoids a confirmation message asking "yes" when asked by SSH if a host should be added to known_hosts: ssh -q -o StrictHostKeyChecking=no So with OpenMPI it can be used as mpirun --mca plm_rsh_agent "ssh -q -o StrictHostKeyChecking=no" == X11 forwarding == <code lang="bash"> ssh -X hostname </code> or add the following line to your .ssh/ssh_config file ForwardX11 yes 2ba52afa0313c83066abe98fe762b7bd1fe4d235 769 768 2012-08-22T10:43:29Z Xalid 12 /* The best way is saying no "YES" */ wikitext text/x-wiki == Passwordless SSH == To set up passwordless SSH, there are three main things to do: * generate a pair of public/private keys on your local computer * copy the public key from the source computer to the target computer's authorized_keys file * check the permissions. You can repeat that transitively for "A->B->C". You can use the initial pair of keys everywhere. 
See here for details: http://www.stearns.org/doc/ssh-techniques.current.html == Automatically saying "yes" == This expect script automates typing "yes" when asked by SSH if a host should be added to known_hosts #!/usr/bin/expect -f set arg1 [lindex $argv 0] set timeout 2 spawn ssh $arg1 expect "yes/no" { send "yes\n" } send "exit\n" send "\r" You can include it in a bash script to iterate over all nodes doing this: for i in `uniq hostfile` ; do ./say-yes.exp $i done == Better than automatically saying "yes" == Remark: It turns out there is a more ellegant way to do this task: using a tool called ''ssh-keyscan''. == Making a cascade of SSH connections easy == Here is a very convenient way to set up the access to any machine directly instead of doing a cascade of SSH calls. If you can not directly access e.g. the machine "heterogeneous", but you can log into "csserver" and then to "heterogeneous", you can put this into your .ssh/config file : Host csserver User kdichev Hostname csserver.ucd.ie Host heterogeneous User kiril Hostname heterogeneous.ucd.ie ProxyCommand ssh -qax csserver nc %h %p Since the installation of a new PBS system, you can not directly log into a hclXX node. 
You can do ssh heterogeneous instead and use "qsub" [[HCL_cluster#Access_and_Security]] == The best way is saying no "YES" == This trick avoids a confirmation message asking "yes" when asked by SSH if a host should be added to known_hosts: ssh -q -o StrictHostKeyChecking=no So with OpenMPI it can be used as mpirun --mca plm_rsh_agent "ssh -q -o StrictHostKeyChecking=no" == X11 forwarding == <code lang="bash"> ssh -X hostname </code> or add the following line to your .ssh/ssh_config file ForwardX11 yes 762ca711326093974281898304b49fcbc7ffb9f5 Grid5000 0 6 773 681 2012-08-24T15:38:28Z Xalid 12 /* Login, job submission, deployment of image */ wikitext text/x-wiki https://www.grid5000.fr/mediawiki/index.php/Grid5000:Home [https://www.grid5000.fr/mediawiki/index.php/Grid5000:UserCharter USAGE POLICY]&nbsp; - Very important, after booking nodes (oarsub ...) run the command:&nbsp;<source lang="">outofchart</source>&nbsp;This will check that you haven't booked too many resources and therefore get in trouble with grid5000 admin. == Login, job submission, deployment of image == *Select sites and clusters for experiments, using information on the [https://www.grid5000.fr/mediawiki/index.php/Grid5000:Network#Grid.275000_Sites Grid5000 network] and the [https://www.grid5000.fr/mediawiki/index.php/Status Status page] *Access is provided via access nodes '''access.SITE.grid5000.fr''' marked [https://www.grid5000.fr/mediawiki/index.php/External_access here] as ''accessible from '''everywhere''' via ssh with '''keyboard-interactive''' authentication method''. As soon as you are on one of the sites, you can directly ssh frontend node of any other site: <source lang="bash"> access_$ ssh frontend.SITE2 </source> *There is no access to Internet from computing nodes (external IPs should be registered on proxy), therefore, download/update your stuff at the access nodes. Several revision control clients are available. 
*Each site has a separate NFS, therefore, to run an application on several sites at once, you need to copy it with '''scp''', '''sftp''' or '''rsync''' between access or frontend nodes. *Jobs are run from the frontend nodes, using a [http://en.wikipedia.org/wiki/OpenPBS PBS]-like system [https://www.grid5000.fr/mediawiki/index.php/Cluster_experiment-OAR2 OAR]. Basic commands: **'''oarstat''' - queue status **'''oarsub''' - job submission **'''oardel''' - job removal Interactive job on deployed images: <source lang="bash"> frontend_$ oarsub -I -t deploy -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> Batch job on installed images: <source lang="bash"> frontend_$ oarsub BATCH_FILE -t allow_classic_ssh -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> Specifying cluster name to reserve: <source lang="bash"> oarsub -r '2012-08-24 19:30:00' -l nodes=16,walltime=12 -p "cluster='Genepi'" </source> *The image to deploy can be created and loaded with the help of a [http://wiki.systemimager.org/index.php/Main_Page Systemimager]-like system [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2 Kadeploy]. Creating: [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2#Tune_an_environment_to_build_another_one:_customize_authentification_parameters described here] Loading: <source lang="bash"> frontend_$ kadeploy3 -a PATH_TO_PRIVATE_IMAGE_DESC -f $OAR_FILE_NODES </source> A Linux distribution lenny-x64-nfs-2.1 with mc, subversion, autotools, doxygen, MPICH2, GSL, Boost, R, gnuplot, graphviz, X11, evince is available at Orsay /home/nancy/alastovetsky/grid5000. == Compiling and running MPI applications == *Compilation should be done on one of the reserved nodes (e.g. ssh `head -n 1 $OAR_NODEFILE`) *Running MPI applications is described [https://www.grid5000.fr/mediawiki/index.php/Run_MPI_On_Grid%275000 here] **mpirun/mpiexec should be run from one of the reserved nodes (e.g.
ssh `head -n 1 $OAR_NODEFILE`) == Setting up new deploy image == List available images kaenv3 -l Then book node and launch: oarsub -I -t deploy -l nodes=1,walltime=12 kadeploy3 -e squeeze-x64-big -f $OAR_FILE_NODES -k ssh root@`head -n 1 $OAR_NODEFILE` default password: grid5000 edit /etc/apt/sources.list apt-get update apt-get upgrade apt-get install libtool autoconf automake mc colorgcc ctags libboost-serialization-dev libboost-graph-dev libatlas-base-dev gfortran vim gdb valgrind screen subversion iperf bc gsl-bin libgsl0-dev Possibly also install (for using extrae): libxml2-dev binutils-dev libunwind7-dev <br> Compiled for sources by us: *<strike>gsl-1.14 (download: ftp://ftp.gnu.org/gnu/gsl/)&nbsp;</strike> ''Now with squeeze it is in repository.'' <strike>./configure &amp;&amp; make &amp;&amp; make install</strike> *mpich2 (download: http://www.mcs.anl.gov/research/projects/mpich2/downloads/index.php?s=downloads) ./configure --enable-shared --enable-sharedlibs=gcc --with-pm=mpd make &amp;&amp; make install Mpich2 installed to: Installing MPE2 include files to /usr/local/include Installing MPE2 libraries to /usr/local/lib Installing MPE2 utility programs to /usr/local/bin Installing MPE2 configuration files to /usr/local/etc Installing MPE2 system utility programs to /usr/local/sbin Installing MPE2 man to /usr/local/share/man Installing MPE2 html to /usr/local/share/doc/ Installed MPE2 in /usr/local *hwloc (and lstopo) (download: http://www.open-mpi.org/software/hwloc/v1.2/) compile from sources. To get xml support install libxml2-dev and pkg-config apt-get install libxml2-dev pkg-config tar -xzvf hwloc-1.1.1.tar.gz cd hwloc-1.1.1 ./configure &amp;&amp; make &amp;&amp; make install Change root password. rm sources from root dir. 
Edit the "message of the day" vi /etc/motd.tail echo 90 &gt; /proc/sys/vm/overcommit_ratio echo 2 &gt; /proc/sys/vm/overcommit_memory date &gt;&gt; release Cleanup apt-get clean rm /etc/udev/rules.d/*-persistent-net.rules Make image ssh root@'''node''' tgz-g5k &gt; $HOME/grid5000/'''imagename'''.tgz make appropriate .env file. kaenv3 -p lenny-x64-nfs -u deploy &gt; lenny-x64-custom-2.3.env <br> == GotoBLAS2 == When compiling gotoblas on a node without direct internet access, you get this error: <source lang="">wget http://www.netlib.org/lapack/lapack-3.1.1.tgz --2011-05-19 03:11:03-- http://www.netlib.org/lapack/lapack-3.1.1.tgz Resolving www.netlib.org... 160.36.58.108 Connecting to www.netlib.org|160.36.58.108|:80... failed: Connection timed out. Retrying. --2011-05-19 03:14:13-- (try: 2) http://www.netlib.org/lapack/lapack-3.1.1.tgz Connecting to www.netlib.org|160.36.58.108|:80... failed: Connection timed out. Retrying. ...</source> Fix by downloading http://www.netlib.org/lapack/lapack-3.1.1.tgz to the GotoBLAS2 source directory and editing this line in the Makefile 184c184 &lt; -wget http://www.netlib.org/lapack/lapack-3.1.1.tgz --- &gt; # -wget http://www.netlib.org/lapack/lapack-3.1.1.tgz <br> GotoBLAS needs to be compiled individually for each unique machine, i.e. each cluster. Add the following to .bashrc export CLUSTER=`hostname |sed 's/\([a-z]*\).*/\1/'` LD_LIBRARY_PATH=$HOME/lib/$CLUSTER:$HOME/lib:/usr/local/lib:$LD_LIBRARY_PATH export LIBRARY_PATH=$HOME/lib/$CLUSTER:$HOME/lib:/usr/local/lib:$LIBRARY_PATH Run the following script once on each cluster: <source lang="bash">#! /bin/bash echo "Compiling gotoblas for cluster: $CLUSTER" cd $HOME/src if [ ! -d "$CLUSTER" ]; then mkdir $CLUSTER fi cd $CLUSTER tar -xzf ../Goto*.tar.gz cd Goto* make &> m.log if [ !
-d "$HOME/lib/$CLUSTER" ]; then mkdir $HOME/lib/$CLUSTER fi cp libgoto2.so $HOME/lib/$CLUSTER echo results ls -d $HOME/src/$CLUSTER ls $HOME/src/$CLUSTER ls -d $HOME/lib/$CLUSTER ls $HOME/lib/$CLUSTER</source> note: for newer processors this may fail. If it is a NEHALEM processor try: make clean make TARGET=NEHALEM == Paging and the OOM-Killer == When doing exhaustion of available memory experiments, problems can occur with over-commit. See [[HCL cluster#Paging_and_the_OOM-Killer]] for more detail. == Example of experiment setup across several sites == Sources of all files mentioned below is available at: [[Grid5000:sources]]. Pick one head node as the main head node (I use grenoble, but any will do). Setup sources cd dave/fupermod-1.1.0 make clean ./configure --with-cblas=goto --prefix=/usr/local/ Reserve 2 nodes from all clusters on a 3 cluster site: oarsub -r "2011-07-25 11:01:01" -t deploy -l cluster=3/nodes=2,walltime=11:59:00 Automate with: for a in 2 3 4; do for i in `cat sites.$a`; do echo $a $i; ssh $i oarsub -r "2011-07-25 11:01:01" -t deploy -l cluster=$a/nodes=2,walltime=11:59:00; done; done Then on each site, where xxx is site name: kadeploy3 -a $HOME/grid5000/lenny-dave.env -f $OAR_NODE_FILE --output-ok-nodes deployed.xxx Gather deployed files to a head node: for i in `cat ~/sites `; do echo $i; scp $i:deployed* .&nbsp;; done cat deployed.* &gt; deployed.all Copy cluster specific libs to each deployed node /usr/local/lib dir with script copy_local_libs.sh deployed.all Copy source files to root dir of each deployed node. 
Then make install each (note: ssh -f does this in parallel) for i in `cat ~/deployed.all`; do echo $i; rsync -aP ~/dave/fupermod-1.1.0 root@$i:&nbsp;; done for i in `cat ~/deployed.all`; do echo $i; ssh -f root@$i "cd fupermod-1.1.0&nbsp;; make all install"&nbsp;; done ssh to the first node ssh `head -n1 deployed.all` n=$(cat deployed.all |wc -l) mpdboot --totalnum=$n --file=$HOME/deployed.all mpdtrace cd dave/data/ mpirun -n $n /usr/local/bin/partitioner -l /usr/local/lib/libmxm_col.so -a0 -D10000 -o N=100 Cleanup after: for i in `cat ~/sites `; do echo $i; ssh $i rm deployed.*&nbsp;; done == Check network speed == apt-get install iperf == Choose which network interface to use == mpirun --mca btl self,openib ... or mpirun --mca btl self,tcp ... 7b8a57c223d90580506e736cfab41cc3e8c6b232 774 773 2012-08-24T15:41:03Z Xalid 12 /* Login, job submission, deployment of image */ wikitext text/x-wiki https://www.grid5000.fr/mediawiki/index.php/Grid5000:Home [https://www.grid5000.fr/mediawiki/index.php/Grid5000:UserCharter USAGE POLICY]&nbsp; - Very important: after booking nodes (oarsub ...) run the command:&nbsp;<source lang="">outofchart</source>&nbsp;This checks that you haven't booked too many resources, so that you don't get in trouble with the Grid5000 admins. == Login, job submission, deployment of image == *Select sites and clusters for experiments, using information on the [https://www.grid5000.fr/mediawiki/index.php/Grid5000:Network#Grid.275000_Sites Grid5000 network] and the [https://www.grid5000.fr/mediawiki/index.php/Status Status page] *Access is provided via access nodes '''access.SITE.grid5000.fr''' marked [https://www.grid5000.fr/mediawiki/index.php/External_access here] as ''accessible from '''everywhere''' via ssh with '''keyboard-interactive''' authentication method''.
As soon as you are on one of the sites, you can ssh directly to the frontend node of any other site: <source lang="bash"> access_$ ssh frontend.SITE2 </source> *There is no Internet access from computing nodes (external IPs should be registered on the proxy); therefore, download/update your stuff at the access nodes. Several revision control clients are available. *Each site has a separate NFS; therefore, to run an application on several sites at once, you need to copy it ('''scp, sftp, rsync''') between access or frontend nodes. *Jobs are run from the frontend nodes, using a [http://en.wikipedia.org/wiki/OpenPBS PBS]-like system, [https://www.grid5000.fr/mediawiki/index.php/Cluster_experiment-OAR2 OAR]. Basic commands: **'''oarstat''' - queue status **'''oarsub''' - job submission **'''oardel''' - job removal Interactive job on deployed images: <source lang="bash"> frontend_$ oarsub -I -t deploy -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> Batch job on installed images: <source lang="bash"> frontend_$ oarsub BATCH_FILE -t allow_classic_ssh -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> Specifying a cluster name to reserve: <source lang="bash"> oarsub -r '2012-08-24 19:30:00' -l nodes=2,walltime=1 -p "cluster='Genepi'" </source> If the resources are available, two nodes from the cluster "Genepi" will be reserved for the specified time. *The image to deploy can be created and loaded with the help of a [http://wiki.systemimager.org/index.php/Main_Page Systemimager]-like system, [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2 Kadeploy].
Creating: [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2#Tune_an_environment_to_build_another_one:_customize_authentification_parameters described here] Loading: <source lang="bash"> frontend_$ kadeploy3 -a PATH_TO_PRIVATE_IMAGE_DESC -f $OAR_FILE_NODES </source> A Linux distribution lenny-x64-nfs-2.1 with mc, subversion, autotools, doxygen, MPICH2, GSL, Boost, R, gnuplot, graphviz, X11, evince is available at Orsay in /home/nancy/alastovetsky/grid5000. == Compiling and running MPI applications == *Compilation should be done on one of the reserved nodes (e.g. ssh `head -n 1 $OAR_NODEFILE`) *Running MPI applications is described [https://www.grid5000.fr/mediawiki/index.php/Run_MPI_On_Grid%275000 here] **mpirun/mpiexec should be run from one of the reserved nodes (e.g. ssh `head -n 1 $OAR_NODEFILE`) == Setting up new deploy image == List available images: kaenv3 -l Then book a node and launch: oarsub -I -t deploy -l nodes=1,walltime=12 kadeploy3 -e squeeze-x64-big -f $OAR_FILE_NODES -k ssh root@`head -n 1 $OAR_NODEFILE` default password: grid5000 edit /etc/apt/sources.list apt-get update apt-get upgrade apt-get install libtool autoconf automake mc colorgcc ctags libboost-serialization-dev libboost-graph-dev libatlas-base-dev gfortran vim gdb valgrind screen subversion iperf bc gsl-bin libgsl0-dev Possibly also install (for using extrae): libxml2-dev binutils-dev libunwind7-dev <br> Compiled from sources by us: *<strike>gsl-1.14 (download: ftp://ftp.gnu.org/gnu/gsl/)&nbsp;</strike> ''Now with squeeze it is in the repository.'' <strike>./configure &amp;&amp; make &amp;&amp; make install</strike> *mpich2 (download: http://www.mcs.anl.gov/research/projects/mpich2/downloads/index.php?s=downloads) ./configure --enable-shared --enable-sharedlibs=gcc --with-pm=mpd make &amp;&amp; make install MPICH2 installed to: Installing MPE2 include files to /usr/local/include Installing MPE2 libraries to /usr/local/lib Installing MPE2 utility programs to /usr/local/bin
Installing MPE2 configuration files to /usr/local/etc Installing MPE2 system utility programs to /usr/local/sbin Installing MPE2 man to /usr/local/share/man Installing MPE2 html to /usr/local/share/doc/ Installed MPE2 in /usr/local *hwloc (and lstopo) (download: http://www.open-mpi.org/software/hwloc/v1.2/) compile from sources. To get XML support, install libxml2-dev and pkg-config: apt-get install libxml2-dev pkg-config tar -xzvf hwloc-1.1.1.tar.gz cd hwloc-1.1.1 ./configure &amp;&amp; make &amp;&amp; make install Change root password. rm sources from root dir. Edit the "message of the day" vi /etc/motd.tail echo 90 &gt; /proc/sys/vm/overcommit_ratio echo 2 &gt; /proc/sys/vm/overcommit_memory date &gt;&gt; release Cleanup apt-get clean rm /etc/udev/rules.d/*-persistent-net.rules Make image ssh root@'''node''' tgz-g5k &gt; $HOME/grid5000/'''imagename'''.tgz Make an appropriate .env file. kaenv3 -p lenny-x64-nfs -u deploy &gt; lenny-x64-custom-2.3.env <br> == GotoBLAS2 == When compiling GotoBLAS2 on a node without direct internet access, you get this error: <source lang="">wget http://www.netlib.org/lapack/lapack-3.1.1.tgz
--2011-05-19 03:11:03-- http://www.netlib.org/lapack/lapack-3.1.1.tgz
Resolving www.netlib.org... 160.36.58.108
Connecting to www.netlib.org|160.36.58.108|:80... failed: Connection timed out. Retrying.
--2011-05-19 03:14:13-- (try: 2) http://www.netlib.org/lapack/lapack-3.1.1.tgz
Connecting to www.netlib.org|160.36.58.108|:80... failed: Connection timed out. Retrying.
...</source> Fix this by downloading http://www.netlib.org/lapack/lapack-3.1.1.tgz to the GotoBLAS2 source directory and commenting out this line in the Makefile:
184c184
&lt; -wget http://www.netlib.org/lapack/lapack-3.1.1.tgz
---
&gt; # -wget http://www.netlib.org/lapack/lapack-3.1.1.tgz
<br> GotoBLAS needs to be compiled individually for each unique machine, i.e. for each cluster. Add the following to .bashrc: export CLUSTER=`hostname |sed 's/\([a-z]*\).*/\1/'` export LD_LIBRARY_PATH=$HOME/lib/$CLUSTER:$HOME/lib:/usr/local/lib:$LD_LIBRARY_PATH export LIBRARY_PATH=$HOME/lib/$CLUSTER:$HOME/lib:/usr/local/lib:$LIBRARY_PATH Run the following script once on each cluster: <source lang="bash">#! /bin/bash
echo "Compiling gotoblas for cluster: $CLUSTER"
cd $HOME/src
if [ ! -d "$CLUSTER" ]; then
  mkdir $CLUSTER
fi
cd $CLUSTER
tar -xzf ../Goto*.tar.gz
cd Goto*
make &> m.log
if [ ! -d "$HOME/lib/$CLUSTER" ]; then
  mkdir $HOME/lib/$CLUSTER
fi
cp libgoto2.so $HOME/lib/$CLUSTER
echo results
ls -d $HOME/src/$CLUSTER
ls $HOME/src/$CLUSTER
ls -d $HOME/lib/$CLUSTER
ls $HOME/lib/$CLUSTER</source> Note: for newer processors this may fail. If it is a Nehalem processor, try: make clean make TARGET=NEHALEM == Paging and the OOM-Killer == When running experiments that exhaust the available memory, problems can occur with over-commit. See [[HCL cluster#Paging_and_the_OOM-Killer]] for more detail. == Example of experiment setup across several sites == Sources of all files mentioned below are available at: [[Grid5000:sources]]. Pick one head node as the main head node (I use grenoble, but any will do).
Setup sources cd dave/fupermod-1.1.0 make clean ./configure --with-cblas=goto --prefix=/usr/local/ Reserve 2 nodes from all clusters on a 3 cluster site: oarsub -r "2011-07-25 11:01:01" -t deploy -l cluster=3/nodes=2,walltime=11:59:00 Automate with: for a in 2 3 4; do for i in `cat sites.$a`; do echo $a $i; ssh $i oarsub -r "2011-07-25 11:01:01" -t deploy -l cluster=$a/nodes=2,walltime=11:59:00; done; done Then on each site, where xxx is site name: kadeploy3 -a $HOME/grid5000/lenny-dave.env -f $OAR_NODE_FILE --output-ok-nodes deployed.xxx Gather deployed files to a head node: for i in `cat ~/sites `; do echo $i; scp $i:deployed* .&nbsp;; done cat deployed.* &gt; deployed.all Copy cluster specific libs to each deployed node /usr/local/lib dir with script copy_local_libs.sh deployed.all Copy source files to root dir of each deployed node. Then make install each (node ssh -f does this in parallel) for i in `cat ~/deployed.all`; do echo $i; rsync -aP ~/dave/fupermod-1.1.0 root@$i:&nbsp;; done for i in `cat ~/deployed.all`; do echo $i; ssh -f root@$i "cd fupermod-1.1.0&nbsp;; make all install"&nbsp;; done ssh to the first node ssh `head -n1 deployed.all` n=$(cat deployed.all |wc -l) mpdboot --totalnum=$n --file=$HOME/deployed.all mpdtrace cd dave/data/ mpirun -n $n /usr/local/bin/partitioner -l /usr/local/lib/libmxm_col.so -a0 -D10000 -o N=100 Cleanup after: for i in `cat ~/sites `; do echo $i; ssh $i rm deployed.*&nbsp;; done == Check network speed == apt-get install iperf == Choose which network interface to use == mpirun --mca btl self,openib ... or mpirun --mca btl self,tcp ... b959b49e12364ab6eb95c915a24db6a35ecd76e3 775 774 2012-08-24T15:41:54Z Xalid 12 /* Login, job submission, deployment of image */ wikitext text/x-wiki https://www.grid5000.fr/mediawiki/index.php/Grid5000:Home [https://www.grid5000.fr/mediawiki/index.php/Grid5000:UserCharter USAGE POLICY]&nbsp; - Very important, after booking nodes (oarsub ...) 
run the command:&nbsp;<source lang="">outofchart</source>&nbsp;This will check that you haven't booked too many resources and therefore get in trouble with grid5000 admin. == Login, job submission, deployment of image == *Select sites and clusters for experiments, using information on the [https://www.grid5000.fr/mediawiki/index.php/Grid5000:Network#Grid.275000_Sites Grid5000 network] and the [https://www.grid5000.fr/mediawiki/index.php/Status Status page] *Access is provided via access nodes '''access.SITE.grid5000.fr''' marked [https://www.grid5000.fr/mediawiki/index.php/External_access here] as ''accessible from '''everywhere''' via ssh with '''keyboard-interactive''' authentication method''. As soon as you are on one of the sites, you can directly ssh frontend node of any other site: <source lang="bash"> access_$ ssh frontend.SITE2 </source> *There is no access to Internet from computing nodes (external IPs should be registered on proxy), therefore, download/update your stuff at the access nodes. Several revision control clients are available. *Each site has a separate NFS, therefore, to run an application on several sites at once, you need to copy it '''scp, sftp, rsync''' between access or frontend nodes. *Jobs are run from the frondend nodes, using a [http://en.wikipedia.org/wiki/OpenPBS PBS]-like system [https://www.grid5000.fr/mediawiki/index.php/Cluster_experiment-OAR2 OAR]. 
Basic commands: **'''oarstat''' - queue status **'''oarsub''' - job submission **'''oardel''' - job removal Interactive job on deployed images: <source lang="bash"> fontend_$ oarsub -I -t deploy -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> Batch job on installed images: <source lang="bash"> fontend_$ oarsub BATCH_FILE -t allow_classic_ssh -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> Specifying cluster name to reserve: <source lang="bash"> oarsub -r 'YYYY-MM-dd HH:mm:ss' -l nodes=2,walltime=1 -p "cluster='Genepi'" </source> If the resources are available two nodes from the cluster "Genepi" will be reserved for the specified time. *The image to deploy can be created and loaded with help of a [http://wiki.systemimager.org/index.php/Main_Page Systemimager]-like system [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2 Kadeploy]. Creating: [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2#Tune_an_environment_to_build_another_one:_customize_authentification_parameters described here] Loading: <source lang="bash"> fontend_$ kadeploy3 -a PATH_TO_PRIVATE_IMAGE_DESC -f $OAR_FILE_NODES </source> A Linux distribution lenny-x64-nfs-2.1 with mc, subversion, autotools, doxygen, MPICH2, GSL, Boost, R, gnuplot, graphviz, X11, evince is available at Orsay /home/nancy/alastovetsky/grid5000. == Compiling and running MPI applications == *Compilation should be done on one of the reserved nodes (e.g. ssh `head -n 1 $OAR_NODEFILE`) *Running MPI applications is described [https://www.grid5000.fr/mediawiki/index.php/Run_MPI_On_Grid%275000 here] **mpirun/mpiexec should be run from one of the reserved nodes (e.g. 
ssh `head -n 1 $OAR_NODEFILE`) == Setting up new deploy image == List available images kaenv3 -l Then book node and launch: oarsub -I -t deploy -l nodes=1,walltime=12 kadeploy3 -e squeeze-x64-big -f $OAR_FILE_NODES -k ssh root@`head -n 1 $OAR_NODEFILE` default password: grid5000 edit /etc/apt/sources.list apt-get update apt-get upgrade apt-get install libtool autoconf automake mc colorgcc ctags libboost-serialization-dev libboost-graph-dev libatlas-base-dev gfortran vim gdb valgrind screen subversion iperf bc gsl-bin libgsl0-dev Possibly also install (for using extrae): libxml2-dev binutils-dev libunwind7-dev <br> Compiled for sources by us: *<strike>gsl-1.14 (download: ftp://ftp.gnu.org/gnu/gsl/)&nbsp;</strike> ''Now with squeeze it is in repository.'' <strike>./configure &amp;&amp; make &amp;&amp; make install</strike> *mpich2 (download: http://www.mcs.anl.gov/research/projects/mpich2/downloads/index.php?s=downloads) ./configure --enable-shared --enable-sharedlibs=gcc --with-pm=mpd make &amp;&amp; make install Mpich2 installed to: Installing MPE2 include files to /usr/local/include Installing MPE2 libraries to /usr/local/lib Installing MPE2 utility programs to /usr/local/bin Installing MPE2 configuration files to /usr/local/etc Installing MPE2 system utility programs to /usr/local/sbin Installing MPE2 man to /usr/local/share/man Installing MPE2 html to /usr/local/share/doc/ Installed MPE2 in /usr/local *hwloc (and lstopo) (download: http://www.open-mpi.org/software/hwloc/v1.2/) compile from sources. To get xml support install libxml2-dev and pkg-config apt-get install libxml2-dev pkg-config tar -xzvf hwloc-1.1.1.tar.gz cd hwloc-1.1.1 ./configure &amp;&amp; make &amp;&amp; make install Change root password. rm sources from root dir. 
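The steps below write the VM over-commit sysctls (overcommit_memory=2, overcommit_ratio=90); a small helper to confirm the values before packing the image. The directory argument defaults to the real procfs path and exists only to make the function easy to exercise elsewhere:

```shell
#!/bin/sh
# overcommit_status [DIR]: print the over-commit settings found in DIR
# (default /proc/sys/vm), so the image can be checked before tgz-g5k.
overcommit_status() {
  dir="${1:-/proc/sys/vm}"
  printf 'overcommit_memory=%s overcommit_ratio=%s\n' \
    "$(cat "$dir/overcommit_memory")" \
    "$(cat "$dir/overcommit_ratio")"
}
```

On a deployed node, just run overcommit_status with no argument.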
Edit the "message of the day" vi /etc/motd.tail echo 90 &gt; /proc/sys/vm/overcommit_ratio echo 2 &gt; /proc/sys/vm/overcommit_memory date &gt;&gt; release Cleanup apt-get clean rm /etc/udev/rules.d/*-persistent-net.rules Make image ssh root@'''node''' tgz-g5k &gt; $HOME/grid5000/'''imagename'''.tgz make appropriate .env file. kaenv3 -p lenny-x64-nfs -u deploy &gt; lenny-x64-custom-2.3.env <br> == GotoBLAS2 == When compiling gotoblas on a node without direct internet access get this error: <source lang="">wget http://www.netlib.org/lapack/lapack-3.1.1.tgz --2011-05-19 03:11:03-- http://www.netlib.org/lapack/lapack-3.1.1.tgz Resolving www.netlib.org... 160.36.58.108 Connecting to www.netlib.org|160.36.58.108|:80... failed: Connection timed out. Retrying. --2011-05-19 03:14:13-- (try: 2) http://www.netlib.org/lapack/lapack-3.1.1.tgz Connecting to www.netlib.org|160.36.58.108|:80... failed: Connection timed out. Retrying. ...</source> Fix by downloading http://www.netlib.org/lapack/lapack-3.1.1.tgz to the GotoBLAS2 source directory and editing this line in the Makefile 184c184 &lt; -wget http://www.netlib.org/lapack/lapack-3.1.1.tgz --- &gt; # -wget http://www.netlib.org/lapack/lapack-3.1.1.tgz <br> GotoBLAS needs to be compiled individualy for each unique machine - ie each cluster. Add the following to .bashrc export CLUSTER=`hostname |sed 's/\([a-z]*\).*/\1/'` LD_LIBRARY_PATH=$HOME/lib/$CLUSTER:$HOME/lib:/usr/local/lib:$LD_LIBRARY_PATH export LIBRARY_PATH=$HOME/lib/$CLUSTER:$HOME/lib:/usr/local/lib:$LIBRARY_PATH Run the following script once on each cluster: <source lang="bash">#! /bin/bash echo "Compiling gotoblas for cluster: $CLUSTER" cd $HOME/src if [ ! -d "$CLUSTER" ]; then mkdir $CLUSTER fi cd $CLUSTER tar -xzf ../Goto*.tar.gz cd Goto* make &> m.log if [ ! 
-d "$HOME/lib/$CLUSTER" ]; then mkdir $HOME/lib/$CLUSTER fi cp libgoto2.so $HOME/lib/$CLUSTER echo results ls -d $HOME/src/$CLUSTER ls $HOME/src/$CLUSTER ls -d $HOME/lib/$CLUSTER ls $HOME/lib/$CLUSTER</source> note: for newer processors this may fail. If it is a NEHALEM processor try: make clean make TARGET=NEHALEM == Paging and the OOM-Killer == When doing exhaustion of available memory experiments, problems can occur with over-commit. See [[HCL cluster#Paging_and_the_OOM-Killer]] for more detail. == Example of experiment setup across several sites == Sources of all files mentioned below is available at: [[Grid5000:sources]]. Pick one head node as the main head node (I use grenoble, but any will do). Setup sources cd dave/fupermod-1.1.0 make clean ./configure --with-cblas=goto --prefix=/usr/local/ Reserve 2 nodes from all clusters on a 3 cluster site: oarsub -r "2011-07-25 11:01:01" -t deploy -l cluster=3/nodes=2,walltime=11:59:00 Automate with: for a in 2 3 4; do for i in `cat sites.$a`; do echo $a $i; ssh $i oarsub -r "2011-07-25 11:01:01" -t deploy -l cluster=$a/nodes=2,walltime=11:59:00; done; done Then on each site, where xxx is site name: kadeploy3 -a $HOME/grid5000/lenny-dave.env -f $OAR_NODE_FILE --output-ok-nodes deployed.xxx Gather deployed files to a head node: for i in `cat ~/sites `; do echo $i; scp $i:deployed* .&nbsp;; done cat deployed.* &gt; deployed.all Copy cluster specific libs to each deployed node /usr/local/lib dir with script copy_local_libs.sh deployed.all Copy source files to root dir of each deployed node. 
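The real copy_local_libs.sh lives in [[Grid5000:sources]]; the sketch below is just one plausible shape of it, with a DRYRUN mode so the generated commands can be inspected without touching any node (the node names and file layout are assumptions):

```shell
#!/bin/sh
# For every node in the deployed-nodes file, copy the library built for
# its cluster (taken from the hostname prefix) into /usr/local/lib there.
# With DRYRUN=1 the scp commands are only printed, not executed.
copy_local_libs() {
  nodefile="$1"
  while read -r node; do
    [ -n "$node" ] || continue
    cluster=$(echo "$node" | sed 's/\([a-z]*\).*/\1/')
    cmd="scp $HOME/lib/$cluster/libgoto2.so root@$node:/usr/local/lib/"
    if [ "${DRYRUN:-0}" = "1" ]; then
      echo "$cmd"
    else
      $cmd
    fi
  done < "$nodefile"
}
```

Typical use: copy_local_libs deployed.all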
Then make install each (note: ssh -f does this in parallel) for i in `cat ~/deployed.all`; do echo $i; rsync -aP ~/dave/fupermod-1.1.0 root@$i:&nbsp;; done for i in `cat ~/deployed.all`; do echo $i; ssh -f root@$i "cd fupermod-1.1.0&nbsp;; make all install"&nbsp;; done ssh to the first node ssh `head -n1 deployed.all` n=$(cat deployed.all |wc -l) mpdboot --totalnum=$n --file=$HOME/deployed.all mpdtrace cd dave/data/ mpirun -n $n /usr/local/bin/partitioner -l /usr/local/lib/libmxm_col.so -a0 -D10000 -o N=100 Cleanup after: for i in `cat ~/sites `; do echo $i; ssh $i rm deployed.*&nbsp;; done == Check network speed == apt-get install iperf == Choose which network interface to use == mpirun --mca btl self,openib ... or mpirun --mca btl self,tcp ... 920f71bebc9ed54929c00cb4f6fadf5a4959eb81 777 775 2012-09-19T10:52:30Z Davepc 2 /* GotoBLAS2 */ wikitext text/x-wiki https://www.grid5000.fr/mediawiki/index.php/Grid5000:Home [https://www.grid5000.fr/mediawiki/index.php/Grid5000:UserCharter USAGE POLICY]&nbsp; - Very important: after booking nodes (oarsub ...) run the command:&nbsp;<source lang="">outofchart</source>&nbsp;This checks that you haven't booked too many resources, so that you don't get in trouble with the Grid5000 admins. == Login, job submission, deployment of image == *Select sites and clusters for experiments, using information on the [https://www.grid5000.fr/mediawiki/index.php/Grid5000:Network#Grid.275000_Sites Grid5000 network] and the [https://www.grid5000.fr/mediawiki/index.php/Status Status page] *Access is provided via access nodes '''access.SITE.grid5000.fr''' marked [https://www.grid5000.fr/mediawiki/index.php/External_access here] as ''accessible from '''everywhere''' via ssh with '''keyboard-interactive''' authentication method''.
As soon as you are on one of the sites, you can ssh directly to the frontend node of any other site: <source lang="bash"> access_$ ssh frontend.SITE2 </source> *There is no Internet access from computing nodes (external IPs should be registered on the proxy); therefore, download/update your stuff at the access nodes. Several revision control clients are available. *Each site has a separate NFS; therefore, to run an application on several sites at once, you need to copy it ('''scp, sftp, rsync''') between access or frontend nodes. *Jobs are run from the frontend nodes, using a [http://en.wikipedia.org/wiki/OpenPBS PBS]-like system, [https://www.grid5000.fr/mediawiki/index.php/Cluster_experiment-OAR2 OAR]. Basic commands: **'''oarstat''' - queue status **'''oarsub''' - job submission **'''oardel''' - job removal Interactive job on deployed images: <source lang="bash"> frontend_$ oarsub -I -t deploy -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> Batch job on installed images: <source lang="bash"> frontend_$ oarsub BATCH_FILE -t allow_classic_ssh -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> Specifying a cluster name to reserve: <source lang="bash"> oarsub -r 'YYYY-MM-dd HH:mm:ss' -l nodes=2,walltime=1 -p "cluster='Genepi'" </source> If the resources are available, two nodes from the cluster "Genepi" will be reserved for the specified time. *The image to deploy can be created and loaded with the help of a [http://wiki.systemimager.org/index.php/Main_Page Systemimager]-like system, [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2 Kadeploy].
Creating: [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2#Tune_an_environment_to_build_another_one:_customize_authentification_parameters described here] Loading: <source lang="bash"> frontend_$ kadeploy3 -a PATH_TO_PRIVATE_IMAGE_DESC -f $OAR_FILE_NODES </source> A Linux distribution lenny-x64-nfs-2.1 with mc, subversion, autotools, doxygen, MPICH2, GSL, Boost, R, gnuplot, graphviz, X11, evince is available at Orsay in /home/nancy/alastovetsky/grid5000. == Compiling and running MPI applications == *Compilation should be done on one of the reserved nodes (e.g. ssh `head -n 1 $OAR_NODEFILE`) *Running MPI applications is described [https://www.grid5000.fr/mediawiki/index.php/Run_MPI_On_Grid%275000 here] **mpirun/mpiexec should be run from one of the reserved nodes (e.g. ssh `head -n 1 $OAR_NODEFILE`) == Setting up new deploy image == List available images: kaenv3 -l Then book a node and launch: oarsub -I -t deploy -l nodes=1,walltime=12 kadeploy3 -e squeeze-x64-big -f $OAR_FILE_NODES -k ssh root@`head -n 1 $OAR_NODEFILE` default password: grid5000 edit /etc/apt/sources.list apt-get update apt-get upgrade apt-get install libtool autoconf automake mc colorgcc ctags libboost-serialization-dev libboost-graph-dev libatlas-base-dev gfortran vim gdb valgrind screen subversion iperf bc gsl-bin libgsl0-dev Possibly also install (for using extrae): libxml2-dev binutils-dev libunwind7-dev <br> Compiled from sources by us: *<strike>gsl-1.14 (download: ftp://ftp.gnu.org/gnu/gsl/)&nbsp;</strike> ''Now with squeeze it is in the repository.'' <strike>./configure &amp;&amp; make &amp;&amp; make install</strike> *mpich2 (download: http://www.mcs.anl.gov/research/projects/mpich2/downloads/index.php?s=downloads) ./configure --enable-shared --enable-sharedlibs=gcc --with-pm=mpd make &amp;&amp; make install MPICH2 installed to: Installing MPE2 include files to /usr/local/include Installing MPE2 libraries to /usr/local/lib Installing MPE2 utility programs to /usr/local/bin
Installing MPE2 configuration files to /usr/local/etc Installing MPE2 system utility programs to /usr/local/sbin Installing MPE2 man to /usr/local/share/man Installing MPE2 html to /usr/local/share/doc/ Installed MPE2 in /usr/local *hwloc (and lstopo) (download: http://www.open-mpi.org/software/hwloc/v1.2/) compile from sources. To get XML support, install libxml2-dev and pkg-config: apt-get install libxml2-dev pkg-config tar -xzvf hwloc-1.1.1.tar.gz cd hwloc-1.1.1 ./configure &amp;&amp; make &amp;&amp; make install Change root password. rm sources from root dir. Edit the "message of the day" vi /etc/motd.tail echo 90 &gt; /proc/sys/vm/overcommit_ratio echo 2 &gt; /proc/sys/vm/overcommit_memory date &gt;&gt; release Cleanup apt-get clean rm /etc/udev/rules.d/*-persistent-net.rules Make image ssh root@'''node''' tgz-g5k &gt; $HOME/grid5000/'''imagename'''.tgz Make an appropriate .env file. kaenv3 -p lenny-x64-nfs -u deploy &gt; lenny-x64-custom-2.3.env <br> == GotoBLAS2 == http://www.tacc.utexas.edu/tacc-projects/gotoblas2 When compiling GotoBLAS2 on a node without direct internet access, you get this error: <source lang="">wget http://www.netlib.org/lapack/lapack-3.1.1.tgz
--2011-05-19 03:11:03-- http://www.netlib.org/lapack/lapack-3.1.1.tgz
Resolving www.netlib.org... 160.36.58.108
Connecting to www.netlib.org|160.36.58.108|:80... failed: Connection timed out. Retrying.
--2011-05-19 03:14:13-- (try: 2) http://www.netlib.org/lapack/lapack-3.1.1.tgz
Connecting to www.netlib.org|160.36.58.108|:80... failed: Connection timed out. Retrying.
...</source> Fix this by downloading http://www.netlib.org/lapack/lapack-3.1.1.tgz to the GotoBLAS2 source directory and commenting out this line in the Makefile:
184c184
&lt; -wget http://www.netlib.org/lapack/lapack-3.1.1.tgz
---
&gt; # -wget http://www.netlib.org/lapack/lapack-3.1.1.tgz
<br> GotoBLAS needs to be compiled individually for each unique machine, i.e. for each cluster. Add the following to .bashrc: export CLUSTER=`hostname |sed 's/\([a-z]*\).*/\1/'` export LD_LIBRARY_PATH=$HOME/lib/$CLUSTER:$HOME/lib:/usr/local/lib:$LD_LIBRARY_PATH export LIBRARY_PATH=$HOME/lib/$CLUSTER:$HOME/lib:/usr/local/lib:$LIBRARY_PATH Run the following script once on each cluster: <source lang="bash">#! /bin/bash
echo "Compiling gotoblas for cluster: $CLUSTER"
cd $HOME/src
if [ ! -d "$CLUSTER" ]; then
  mkdir $CLUSTER
fi
cd $CLUSTER
tar -xzf ../Goto*.tar.gz
cd Goto*
make &> m.log
if [ ! -d "$HOME/lib/$CLUSTER" ]; then
  mkdir $HOME/lib/$CLUSTER
fi
cp libgoto2.so $HOME/lib/$CLUSTER
echo results
ls -d $HOME/src/$CLUSTER
ls $HOME/src/$CLUSTER
ls -d $HOME/lib/$CLUSTER
ls $HOME/lib/$CLUSTER</source> Note: for newer processors this may fail. If it is a Nehalem processor, try: make clean make TARGET=NEHALEM == Paging and the OOM-Killer == When running experiments that exhaust the available memory, problems can occur with over-commit. See [[HCL cluster#Paging_and_the_OOM-Killer]] for more detail. == Example of experiment setup across several sites == Sources of all files mentioned below are available at: [[Grid5000:sources]]. Pick one head node as the main head node (I use grenoble, but any will do).
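The reservations below all follow the same -l resource-string pattern; a tiny helper that assembles it (the numbers are just the ones used in this example):

```shell
#!/bin/sh
# oar_resources CLUSTERS NODES WALLTIME: build the -l argument string
# used by the oarsub reservations below.
oar_resources() {
  echo "cluster=$1/nodes=$2,walltime=$3"
}

# Example: two nodes from each of three clusters, for almost 12 hours.
oar_resources 3 2 11:59:00   # -> cluster=3/nodes=2,walltime=11:59:00
```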
Setup sources cd dave/fupermod-1.1.0 make clean ./configure --with-cblas=goto --prefix=/usr/local/ Reserve 2 nodes from all clusters on a 3 cluster site: oarsub -r "2011-07-25 11:01:01" -t deploy -l cluster=3/nodes=2,walltime=11:59:00 Automate with: for a in 2 3 4; do for i in `cat sites.$a`; do echo $a $i; ssh $i oarsub -r "2011-07-25 11:01:01" -t deploy -l cluster=$a/nodes=2,walltime=11:59:00; done; done Then on each site, where xxx is the site name: kadeploy3 -a $HOME/grid5000/lenny-dave.env -f $OAR_NODE_FILE --output-ok-nodes deployed.xxx Gather deployed files to a head node: for i in `cat ~/sites `; do echo $i; scp $i:deployed* .&nbsp;; done cat deployed.* &gt; deployed.all Copy cluster-specific libs to each deployed node's /usr/local/lib dir with script copy_local_libs.sh deployed.all Copy source files to root dir of each deployed node. Then make install each (note: ssh -f does this in parallel) for i in `cat ~/deployed.all`; do echo $i; rsync -aP ~/dave/fupermod-1.1.0 root@$i:&nbsp;; done for i in `cat ~/deployed.all`; do echo $i; ssh -f root@$i "cd fupermod-1.1.0&nbsp;; make all install"&nbsp;; done ssh to the first node ssh `head -n1 deployed.all` n=$(cat deployed.all |wc -l) mpdboot --totalnum=$n --file=$HOME/deployed.all mpdtrace cd dave/data/ mpirun -n $n /usr/local/bin/partitioner -l /usr/local/lib/libmxm_col.so -a0 -D10000 -o N=100 Cleanup after: for i in `cat ~/sites `; do echo $i; ssh $i rm deployed.*&nbsp;; done == Check network speed == apt-get install iperf == Choose which network interface to use == mpirun --mca btl self,openib ... or mpirun --mca btl self,tcp ... 29ab42eafe0b951024d81fa834791303cbf85310 BitTorrent (B.
Cohen's version) 0 79 776 653 2012-09-14T12:27:39Z Kiril 3 wikitext text/x-wiki The modified bittorrent tarball is currently under the MPIBlib repository: svn co https://hcl.ucd.ie/repos/CPM/trunk/MPIBlib/tests/bittorrent/bittorrent.tar.gz * Extract this under your home directory ** You might need mv $HOME/lib/python $HOME/lib/python2.6 * Export the following variable to tweak the way Python paths are searched: export PYTHONPATH=/usr/lib/python2.6/:<local-installation-path-of-Python-libs> * Modify the logfile path in /home/kdichev/lib/python2.6/BitTorrent/StorageWrapper.py and create the corresponding directory structure * Create a file of any size consisting of "s" characters only. * Create the torrent file: btmakemetafile myfile.ext http://<frontend on G5K>:6969/announce * Stick the torrent file into $HOME/public at the frontend. ** It is then available from within G5K under http://public.lille.grid5000.fr/~kdichev/{torrent file} * Start the tracker on the frontend: bttrack --port 6969 --dfile dstate * Book nodes interactively: qsub -I -lnodes=... * Start one client that has the full copy: ssh <one-node> btdownloadheadless --url http://public.lille.grid5000.fr/~kdichev/{torrent file} & * On all other nodes, launch the client at the same time: script-per-process.sh #!/bin/bash cd /tmp ; btdownloadheadless --display_interval 100 --url http://public.lille.grid5000.fr/~kdichev/{torrent file} run.sh mpirun -n 14 --machinefile hostfile $PWD/script-per-process.sh 03282ab675a458fb062eb4cae4d702b4651fe05c Gnuplot 0 22 778 695 2012-10-11T01:40:04Z Davepc 2 wikitext text/x-wiki [http://www.gnuplot.info/documentation.html Official gnuplot documentation] [http://gnuplot.sourceforge.net/demo/ Demo scripts for gnuplot] [http://t16web.lanl.gov/Kawano/gnuplot/index-e.html GNUPLOT: not so Frequently Asked Questions] When plotting "points" data files from fupermod, you will need [http://gnuplot.sourceforge.net/docs_4.2/node172.html this]: set datafile missing "."
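To see why that setting matters: fupermod "points" files mark absent measurements with a bare ".", which gnuplot would otherwise try to read as a value. A sketch that writes such a data file together with a matching plot script (file names are made up):

```shell
#!/bin/sh
# write_demo DIR: create a small "points"-style data file in which "."
# marks a missing value, plus a gnuplot script that skips those entries.
write_demo() {
  dir="$1"
  cat > "$dir/demo.dat" <<'EOF'
1 0.52
2 .
3 0.61
EOF
  cat > "$dir/demo.gp" <<'EOF'
set datafile missing "."
plot "demo.dat" using 1:2 with linespoints
EOF
}
```

If gnuplot is installed: write_demo . &amp;&amp; gnuplot -persist demo.gp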
Put in a multiplication symbol with {/Symbol \264} [http://gnuplot-tricks.blogspot.ie/2009/05/gnuplot-tricks-many-say-that-it-is.html here] and [http://quark.phys.s.u-tokyo.ac.jp/~kawanai/file/guide.pdf here] === Error message "';' expected" === That syntax (linetype specification with just a number, but no keyword) has been deprecated for several years now. It had never been an officially documented feature anyway, and was removed ages ago. Have a look at "help plot style" to see how it's done. [http://groups.google.com/group/comp.graphics.apps.gnuplot/browse_thread/thread/00cb432c02560cf3 More] Deprecated: plot with lines 1 Should be: plot with lines ls 1 9e54a1bc4b9784a563561771e0bfb37236055ef3 779 778 2012-10-11T01:40:26Z Davepc 2 wikitext text/x-wiki [http://www.gnuplot.info/documentation.html Official gnuplot documentation] [http://gnuplot.sourceforge.net/demo/ Demo scripts for gnuplot] [http://t16web.lanl.gov/Kawano/gnuplot/index-e.html GNUPLOT: not so Frequently Asked Questions] When plotting "points" data files from fupermod, you will need [http://gnuplot.sourceforge.net/docs_4.2/node172.html this]: set datafile missing "." Put in a multiplication symbol with {/Symbol \264} [http://gnuplot-tricks.blogspot.ie/2009/05/gnuplot-tricks-many-say-that-it-is.html here] and [http://quark.phys.s.u-tokyo.ac.jp/~kawanai/file/guide.pdf here] === Error message "';' expected" === That syntax (linetype specification with just a number, but no keyword) has been deprecated for several years now. It had never been an officially documented feature anyway, and was removed ages ago. Have a look at "help plot style" to see how it's done.
[http://groups.google.com/group/comp.graphics.apps.gnuplot/browse_thread/thread/00cb432c02560cf3 More] Deprecated: plot with lines 1 Should be: plot with lines ls 1 131d2c1f115c711093d7b092bf90c55c10c695bf C/C++ 0 14 780 568 2012-11-01T13:10:27Z Davepc 2 /* Tips & Tricks */ wikitext text/x-wiki == Coding == * C++ programming style is preferable. For example, in variable declarations, pointers and references should have their reference symbol next to the type rather than to the name. Variables should be initialized where they are declared, and should be declared where they are used. For more details, see [http://google-styleguide.googlecode.com/svn/trunk/cppguide.xml Google C++ Style Guide] * [http://en.wikipedia.org/wiki/Indent_style#Variant:_1TBS One-true-brace indent style] * [http://en.wikipedia.org/wiki/Pragma_once Coding header files] * Learn from examples and use coding approaches from third-party software == Commenting == * Place [[Doxygen]] comments in header files (before declarations of namespaces/classes/structs/typedefs/macros) and main source files (for documenting tools and tests) * Use a double forward slash for short comments in the code == C++ == * [http://developers.sun.com/solaris/articles/mixing.html Mixing C/C++] * Provide the main API in C * Use plain C unless you need flexible data structures or [[STL]]/[[Boost]] functionality * [http://en.wikipedia.org/wiki/Template_metaprogramming Template C++] is preferable from the point of view of runtime performance * Mind the life cycle of objects: [http://en.wikipedia.org/wiki/Default_constructor Default constructor], [http://en.wikipedia.org/wiki/Copy_constructor Copy constructor], [http://en.wikipedia.org/wiki/Destructor_(computer_science) Destructor] * [http://www.gnu.org/software/hello/manual/automake/Libtool-Convenience-Libraries.html Force C++ linking] == Tips &amp; Tricks == *[http://www.gnu.org/s/libc/manual/html_node/Date-and-Time.html#Date-and-Time Timing in C] *Don't use non-standard functions, 
like [http://en.wikipedia.org/wiki/Itoa itoa] *[http://www.gnu.org/software/libc/manual/html_node/Program-Arguments.html Handling program arguments] (avoid <code>argp</code> since it is not supported on many platforms) *[http://en.wikipedia.org/wiki/Dynamic_loading Dynamic loading of shared libraries] *Avoid [http://en.wikipedia.org/wiki/Variable-length_array variable-length arrays]. First, GCC allocates them on the stack. Second, the status of this feature in GCC is BROKEN. Therefore, never do this: <source lang="C"> int size; MPI_Comm_size(MPI_COMM_WORLD, &size); char names[size][MPI_MAX_PROCESSOR_NAME]; </source> *Implement delays in the execution of the program with the help of [http://linux.die.net/man/2/nanosleep nanosleep]. Compared to sleep and usleep, nanosleep has the advantage of not affecting any signals, it is standardized by POSIX, it provides higher timing resolution, and it makes it easier to resume a sleep that has been interrupted by a signal. *Indenting in fupermod is done in the [http://google-styleguide.googlecode.com/svn/trunk/cppguide.xml?showone=Spaces_vs._Tabs#Spaces_vs._Tabs Google code style]: two literal spaces, no tabs. To set vim to do this, put the following in .vimrc: set autoindent set expandtab set tabstop=2 set shiftwidth=2 set softtabstop=2 *To indent all .c and .h files with vim, use the following ([http://stackoverflow.com/questions/3218528/indenting-in-vim-with-all-the-files-in-folder explained here]): :args ./*/*.[ch] | argdo execute "normal gg=G" | update 19b9b4a234ac659fb3e7da5a756b229e1ffd93e4 781 780 2012-11-01T13:12:30Z Davepc 2 /* Tips &amp; Tricks */ wikitext text/x-wiki == Coding == * C++ programming style is preferable. For example, in variable declarations, pointers and references should have their reference symbol next to the type rather than to the name. Variables should be initialized where they are declared, and should be declared where they are used. 
For more details, see [http://google-styleguide.googlecode.com/svn/trunk/cppguide.xml Google C++ Style Guide] * [http://en.wikipedia.org/wiki/Indent_style#Variant:_1TBS One-true-brace indent style] * [http://en.wikipedia.org/wiki/Pragma_once Coding header files] * Learn from examples and use coding approaches from third-party software == Commenting == * Place [[Doxygen]] comments in header files (before declarations of namespaces/classes/structs/typedefs/macros) and main source files (for documenting tools and tests) * Use a double forward slash for short comments in the code == C++ == * [http://developers.sun.com/solaris/articles/mixing.html Mixing C/C++] * Provide the main API in C * Use plain C unless you need flexible data structures or [[STL]]/[[Boost]] functionality * [http://en.wikipedia.org/wiki/Template_metaprogramming Template C++] is preferable from the point of view of runtime performance * Mind the life cycle of objects: [http://en.wikipedia.org/wiki/Default_constructor Default constructor], [http://en.wikipedia.org/wiki/Copy_constructor Copy constructor], [http://en.wikipedia.org/wiki/Destructor_(computer_science) Destructor] * [http://www.gnu.org/software/hello/manual/automake/Libtool-Convenience-Libraries.html Force C++ linking] == Tips &amp; Tricks == *[http://www.gnu.org/s/libc/manual/html_node/Date-and-Time.html#Date-and-Time Timing in C] *Don't use non-standard functions, like [http://en.wikipedia.org/wiki/Itoa itoa] *[http://www.gnu.org/software/libc/manual/html_node/Program-Arguments.html Handling program arguments] (avoid <code>argp</code> since it is not supported on many platforms) *[http://en.wikipedia.org/wiki/Dynamic_loading Dynamic loading of shared libraries] *Avoid [http://en.wikipedia.org/wiki/Variable-length_array variable-length arrays]. First, GCC allocates them on the stack. Second, the status of this feature in GCC is BROKEN. 
Therefore, never do this: <source lang="C"> int size; MPI_Comm_size(MPI_COMM_WORLD, &size); char names[size][MPI_MAX_PROCESSOR_NAME]; </source> *Implement delays in the execution of the program with the help of [http://linux.die.net/man/2/nanosleep nanosleep]. Compared to sleep and usleep, nanosleep has the advantage of not affecting any signals, it is standardized by POSIX, it provides higher timing resolution, and it makes it easier to resume a sleep that has been interrupted by a signal. *Indenting in fupermod is done in the [http://google-styleguide.googlecode.com/svn/trunk/cppguide.xml?showone=Spaces_vs._Tabs#Spaces_vs._Tabs Google code style]: two literal spaces, no tabs. To set vim to do this, put the following in .vimrc: set autoindent set expandtab set tabstop=2 set shiftwidth=2 set softtabstop=2 *To indent all .c and .h files with vim, use the following ([http://stackoverflow.com/questions/3218528/indenting-in-vim-with-all-the-files-in-folder explained here]): :args ./*/*.[ch] | argdo execute "normal gg=G" | update or use the Unix command $ indent 941ecf1b3a38f661500a6e815d55cdf9be0800de Subversion 0 19 782 712 2012-11-01T16:26:50Z Davepc 2 /* RapidSVN, Gforge &amp; passwords */ wikitext text/x-wiki http://svnbook.red-bean.com/ *Subversion clients work with <code>.svn</code> directories - don't remove them. *Mind the version of the client (currently, 1.5, 1.6). 
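The two C/C++ tips above (replacing the broken VLA pattern with heap allocation, and resuming a nanosleep interrupted by a signal) can be sketched in plain C. This is a minimal sketch: <code>NAME_LEN</code> stands in for MPI_MAX_PROCESSOR_NAME, the MPI calls are omitted so the fragment stays self-contained, and <code>alloc_names</code>/<code>sleep_full</code> are hypothetical helper names, not part of any HCL library.

```c
#define _POSIX_C_SOURCE 199309L  /* for nanosleep() under strict -std modes */
#include <errno.h>
#include <stdlib.h>
#include <time.h>

#define NAME_LEN 256  /* stand-in for MPI_MAX_PROCESSOR_NAME */

/* Heap-allocated replacement for the VLA `char names[size][NAME_LEN]`:
 * the same two-dimensional layout, but on the heap instead of the stack.
 * In real code, `size` would come from MPI_Comm_size(). */
char (*alloc_names(int size))[NAME_LEN]
{
    return malloc((size_t)size * sizeof(char[NAME_LEN]));
}

/* Sleep for the whole interval, resuming after signal interruptions.
 * When nanosleep() is interrupted it returns -1 with errno == EINTR
 * and stores the unslept remainder in `rem`, so we retry with that. */
int sleep_full(double seconds)
{
    struct timespec req, rem;
    req.tv_sec  = (time_t)seconds;
    req.tv_nsec = (long)((seconds - (double)req.tv_sec) * 1e9);
    while (nanosleep(&req, &rem) == -1) {
        if (errno != EINTR)
            return -1;  /* a real error, not an interruption */
        req = rem;      /* continue the remaining part of the sleep */
    }
    return 0;
}
```

The caller indexes <code>names[i]</code> exactly as with the VLA and must <code>free(names)</code> afterwards.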
== RapidSVN, Gforge &amp; passwords == '''Problem:''' RapidSVN doesn't directly support svn over ssh and so doesn't remember ssh passwords. And <strike>gforge.ucd.ie appears not to support passwordless authentication with a public key</strike>. There was a bug in the cron job that updated the keys. '''Solution:''' &nbsp; ssh gforge.ucd.ie &nbsp; chmod 700 .ssh And add your key (.ssh/id_rsa.pub) to http://gforge.ucd.ie/&nbsp; (and wait for the cron job to add it, or do it manually) <strike>'''Solution:''' Use sshpass to remember the password.</strike> <strike>Note: this method involves having your gforge password in plain text, and so is a potential security risk - it should be different from other passwords etc.</strike> <strike>Install sshpass &gt;=1.05 (note Ubuntu 11.10 uses version 1.04, which just hangs - so install from sources or use Ubuntu 12.04)</strike> <strike>edit ~/.subversion/config, in the&nbsp;[tunnels] section add the line:</strike> <strike>gforge = sshpass -f{path to file holding password} ssh -o PubkeyAuthentication=no -o ControlMaster=no </strike> <strike>Then check out with:</strike> <strike>svn checkout svn+gforge://&lt;user&gt;@gforge.ucd.ie/var/lib/gforge/chroot/scmrepos/svn/fupermod/trunk fupermod </strike> <strike>(where previously it was:&nbsp;svn checkout svn+ssh)</strike> <strike>To change an existing working copy</strike> <strike>svn switch --relocate svn+ssh://&lt;user&gt;@gforge.ucd.ie/&lt;old url&gt; svn+gforge://&lt;user&gt;@gforge.ucd.ie/&lt;new url&gt;</strike> 17244978e8829b33d5093ae59df774d2412a57fc 783 782 2012-11-01T16:27:04Z Davepc 2 wikitext text/x-wiki http://svnbook.red-bean.com/ *Subversion clients work with <code>.svn</code> directories - don't remove them. *Mind the version of the client (currently, 1.5, 1.6). 
== Repositories == *http://gforge.ucd.ie/softwaremap/tag_cloud.php?tag=heterogeneous+computing == To submit == *Software sources: models, code, resource files *Documentation sources: texts, diagrams, data *Configuration files *Test sources: code, input data == Not to submit == *Binaries: object files, libraries, executables *Built documentation: html, pdf *Personal settings: Eclipse projects, ... *Test output = Subversion for Users = A good cross-platform client: [http://www.rapidsvn.org/index.php/Documentation RapidSVN], combined with [http://meldmerge.org/ Meld], a visual diff and merge tool. == RapidSVN, Gforge &amp; passwords == '''Problem:''' RapidSVN doesn't directly support svn over ssh and so doesn't remember ssh passwords. And <strike>gforge.ucd.ie appears not to support passwordless authentication with a public key</strike>. There was a bug in the cron job that updated the keys. '''Solution:''' &nbsp; ssh gforge.ucd.ie &nbsp; chmod 700 .ssh And add your key (.ssh/id_rsa.pub) to http://gforge.ucd.ie/&nbsp; (and wait for the cron job to add it, or do it manually) <strike>'''Solution:''' Use sshpass to remember the password.</strike> <strike>Note: this method involves having your gforge password in plain text, and so is a potential security risk - it should be different from other passwords etc.</strike> <strike>Install sshpass &gt;=1.05 (note Ubuntu 11.10 uses version 1.04, which just hangs - so install from sources or use Ubuntu 12.04)</strike> <strike>edit ~/.subversion/config, in the&nbsp;[tunnels] section add the line:</strike> <strike>gforge = sshpass -f{path to file holding password} ssh -o PubkeyAuthentication=no -o ControlMaster=no </strike> <strike>Then check out with:</strike> <strike>svn checkout svn+gforge://&lt;user&gt;@gforge.ucd.ie/var/lib/gforge/chroot/scmrepos/svn/fupermod/trunk fupermod </strike> <strike>(where previously it was:&nbsp;svn checkout svn+ssh)</strike> <strike>To change an existing working copy</strike> <strike>svn switch 
--relocate svn+ssh://&lt;user&gt;@gforge.ucd.ie/&lt;old url&gt; svn+gforge://&lt;user&gt;@gforge.ucd.ie/&lt;new url&gt;</strike> db8fbe04fc53c7367a05d48205334ca755067b09 784 783 2012-11-01T16:27:36Z Davepc 2 wikitext text/x-wiki http://svnbook.red-bean.com/ *Subversion clients work with <code>.svn</code> directories - don't remove them. *Mind the version of the client (currently, 1.5, 1.6). == Repositories == *http://gforge.ucd.ie/softwaremap/tag_cloud.php?tag=heterogeneous+computing == To submit == *Software sources: models, code, resource files *Documentation sources: texts, diagrams, data *Configuration files *Test sources: code, input data == Not to submit == *Binaries: object files, libraries, executables *Built documentation: html, pdf *Personal settings: Eclipse projects, ... *Test output = Subversion for Users = A good cross-platform client: [http://www.rapidsvn.org/index.php/Documentation RapidSVN], combined with [http://meldmerge.org/ Meld], a visual diff and merge tool. == RapidSVN, Gforge &amp; passwords == '''Problem:''' RapidSVN doesn't directly support svn over ssh and so doesn't remember ssh passwords. And <strike>gforge.ucd.ie appears not to support passwordless authentication with a public key</strike>. 
There was a bug in the cron job that updated the keys. '''Solution:''' &nbsp; ssh gforge.ucd.ie &nbsp; chmod 700 .ssh And add your key (.ssh/id_rsa.pub) to http://gforge.ucd.ie/ (and wait for the cron job to add it, or do it manually) <strike>'''Solution:''' Use sshpass to remember the password.</strike> <strike>Note: this method involves having your gforge password in plain text, and so is a potential security risk - it should be different from other passwords etc.</strike> <strike>Install sshpass &gt;=1.05 (note Ubuntu 11.10 uses version 1.04, which just hangs - so install from sources or use Ubuntu 12.04)</strike> <strike>edit ~/.subversion/config, in the&nbsp;[tunnels] section add the line:</strike> <strike>gforge = sshpass -f{path to file holding password} ssh -o PubkeyAuthentication=no -o ControlMaster=no </strike> <strike>Then check out with:</strike> <strike>svn checkout svn+gforge://&lt;user&gt;@gforge.ucd.ie/var/lib/gforge/chroot/scmrepos/svn/fupermod/trunk fupermod </strike> <strike>(where previously it was:&nbsp;svn checkout svn+ssh)</strike> <strike>To change an existing working copy</strike> <strike>svn switch --relocate svn+ssh://&lt;user&gt;@gforge.ucd.ie/&lt;old url&gt; svn+gforge://&lt;user&gt;@gforge.ucd.ie/&lt;new url&gt;</strike> f7d0203d9092dd544f889c04192fc02655b4cdfa PGF/Tikz 0 85 785 683 2012-11-07T17:43:50Z Davepc 2 wikitext text/x-wiki * [http://en.wikipedia.org/wiki/PGF/TikZ Tikz on Wikipedia] * [http://www.ctan.org/tex-archive/graphics/pgf/base/doc/generic/pgf/pgfmanual.pdf PGF manual] = Write a figure = The preamble of the LaTeX file must contain: <source lang="latex">\usepackage{tikz}</source> Optional libraries can be added like this: <source lang="latex">\usetikzlibrary{calc}</source> To start a figure, the code must go inside the tikzpicture environment like this: <source lang="latex">\begin{tikzpicture} ... TikZ code here... 
\end{tikzpicture}</source> = Example = <source lang="latex"> % Author: Quintin Jean-Noël % <http://moais.imag.fr/membres/jean-noel.quintin/> \documentclass{article} \usepackage{tikz} \usetikzlibrary[topaths] % A counter, since TikZ is not clever enough (yet) to handle % arbitrary angle systems. \newcount\mycount \begin{document} \begin{tikzpicture}[transform shape] %the multiplication with floats is not possible. Thus I split the loop %in two. \foreach \number in {1,...,8}{ % Compute angle: \mycount=\number \advance\mycount by -1 \multiply\mycount by 45 \advance\mycount by 0 \node[draw,circle,inner sep=0.125cm] (N-\number) at (\the\mycount:5.4cm) {}; } \foreach \number in {9,...,16}{ % Compute angle: \mycount=\number \advance\mycount by -1 \multiply\mycount by 45 \advance\mycount by 22.5 \node[draw,circle,inner sep=0.125cm] (N-\number) at (\the\mycount:5.4cm) {}; } \foreach \number in {1,...,15}{ \mycount=\number \advance\mycount by 1 \foreach \numbera in {\the\mycount,...,16}{ \path (N-\number) edge[->,bend right=3] (N-\numbera) edge[<-,bend left=3] (N-\numbera); } } \end{tikzpicture} \end{document} </source> = See also = *[[Pfgplot]] 06fe590abab5ba1441f2bbe0dfc561f8d2906479 786 785 2012-11-07T17:46:31Z Davepc 2 wikitext text/x-wiki * [http://en.wikipedia.org/wiki/PGF/TikZ Tikz on Wikipedia] * [http://www.ctan.org/tex-archive/graphics/pgf/base/doc/generic/pgf/pgfmanual.pdf PGF manual] * [http://www.texample.net/tikz/examples/ Examples] = Write a figure = The preamble of the LaTeX file must contain: <source lang="latex">\usepackage{tikz}</source> Optional libraries can be added like this: <source lang="latex">\usetikzlibrary{calc}</source> To start a figure, the code must go inside the tikzpicture environment like this: <source lang="latex">\begin{tikzpicture} ... TikZ code here... 
\end{tikzpicture}</source> = Example = <source lang="latex"> % Author: Quintin Jean-Noël % <http://moais.imag.fr/membres/jean-noel.quintin/> \documentclass{article} \usepackage{tikz} \usetikzlibrary[topaths] % A counter, since TikZ is not clever enough (yet) to handle % arbitrary angle systems. \newcount\mycount \begin{document} \begin{tikzpicture}[transform shape] %the multiplication with floats is not possible. Thus I split the loop %in two. \foreach \number in {1,...,8}{ % Compute angle: \mycount=\number \advance\mycount by -1 \multiply\mycount by 45 \advance\mycount by 0 \node[draw,circle,inner sep=0.125cm] (N-\number) at (\the\mycount:5.4cm) {}; } \foreach \number in {9,...,16}{ % Compute angle: \mycount=\number \advance\mycount by -1 \multiply\mycount by 45 \advance\mycount by 22.5 \node[draw,circle,inner sep=0.125cm] (N-\number) at (\the\mycount:5.4cm) {}; } \foreach \number in {1,...,15}{ \mycount=\number \advance\mycount by 1 \foreach \numbera in {\the\mycount,...,16}{ \path (N-\number) edge[->,bend right=3] (N-\numbera) edge[<-,bend left=3] (N-\numbera); } } \end{tikzpicture} \end{document} </source> = See also = *[[Pfgplot]] b0d3b5b830127a448d2ed5c5afdac290842d16bf Main Page 0 1 787 704 2012-11-15T12:00:30Z Root 1 wikitext text/x-wiki This site is set up for sharing ideas, findings and experience in heterogeneous computing. Please, log in and create new or edit existing pages. To learn how to format wiki pages, read [[Help:Editing|here]]. 
== HCL software for heterogeneous computing == * Extensions for [[MPI]]: [http://hcl.ucd.ie/project/mpC mpC] [http://hcl.ucd.ie/project/HeteroMPI HeteroMPI] [http://hcl.ucd.ie/project/libELC libELC] * Extensions for [http://en.wikipedia.org/wiki/GridRPC GridRPC]: [http://hcl.ucd.ie/project/SmartGridSolve SmartGridSolve] [http://hcl.ucd.ie/project/NI-Connect NI-Connect] * Computation benchmarking, modeling, dynamic load balancing: [http://hcl.ucd.ie/project/fupermod FuPerMod] [http://hcl.ucd.ie/project/pmm PMM] * Communication benchmarking, modeling, optimization: [http://hcl.ucd.ie/project/cpm CPM] [http://hcl.ucd.ie/project/mpiblib MPIBlib] == Heterogeneous mathematical software == * [http://hcl.ucd.ie/project/HeteroScaLAPACK HeteroScaLAPACK] * [http://hcl.ucd.ie/project/Hydropad Hydropad] == Operating systems == * [[Linux]] * [[Windows]] == Development tools == * [[C/C++]], [[Python]], [[UML]], [[FORTRAN]] * [[Autotools]] * [[GDB]], [[OProfile]], [[Valgrind]] * [[Doxygen]] * [[ChangeLog]], [[Subversion]] * [[Eclipse]] * [[Bash Scripts]] == [[Libraries]] == * [[GNU C Library]] * [[MPI]] * [[STL]], [[Boost]] * [[GSL]] * [[BLAS LAPACK ScaLAPACK]] * [[NLOPT]] * [[BitTorrent (B. 
Cohen's version)]] * [[CUDA SDK]] == Data processing == * [[gnuplot]], [[pgfplot]], [[matplotlib]] * [[Graphviz]] * [[Octave]], [[R]] * [[G3DViewer]] == Paper & Presentation Tools == * [[Dia]], [[PGF/Tikz]], [[pgfplot]] * [[LaTeX]], [[Beamer]] * [[BibTeX]], [[JabRef]] == Hardware == * [[HCL cluster]] * [[Other UCD Resources]] * [[UTK multicores + GPU]] * [[Grid5000]] [[SSH|How to connect to cluster via SSH]] [[hwloc|How to find information about the hardware]] == Mathematics == * [http://en.wikipedia.org/wiki/Confidence_interval Confidence interval (Statistics)], [http://en.wikipedia.org/wiki/Student's_t-distribution Student's t-distribution] (implemented in [[GSL]]) * [http://en.wikipedia.org/wiki/Linear_regression Linear regression] (implemented in [[GSL]]) * [http://en.wikipedia.org/wiki/Binomial_tree#Binomial_tree Binomial tree] (use [[Graphviz]] to visualize trees) * [http://en.wikipedia.org/wiki/Spline_interpolation Spline interpolation], [http://en.wikipedia.org/wiki/B-spline Spline approximation] (implemented in [[GSL]]) 5b9fb4989400abfb78b41ba0515668cc2a36f2fd 789 787 2012-12-06T18:21:56Z Davepc 2 /* Hardware */ wikitext text/x-wiki This site is set up for sharing ideas, findings and experience in heterogeneous computing. Please, log in and create new or edit existing pages. How to format wiki-pages read [[Help:Editing|here]]. 
== HCL software for heterogeneous computing == * Extensions for [[MPI]]: [http://hcl.ucd.ie/project/mpC mpC] [http://hcl.ucd.ie/project/HeteroMPI HeteroMPI] [http://hcl.ucd.ie/project/libELC libELC] * Extensions for [http://en.wikipedia.org/wiki/GridRPC GridRPC]: [http://hcl.ucd.ie/project/SmartGridSolve SmartGridSolve] [http://hcl.ucd.ie/project/NI-Connect NI-Connect] * Computation benchmarking, modeling, dynamic load balancing: [http://hcl.ucd.ie/project/fupermod FuPerMod] [http://hcl.ucd.ie/project/pmm PMM] * Communication benchmarking, modeling, optimization: [http://hcl.ucd.ie/project/cpm CPM] [http://hcl.ucd.ie/project/mpiblib MPIBlib] == Heterogeneous mathematical software == * [http://hcl.ucd.ie/project/HeteroScaLAPACK HeteroScaLAPACK] * [http://hcl.ucd.ie/project/Hydropad Hydropad] == Operating systems == * [[Linux]] * [[Windows]] == Development tools == * [[C/C++]], [[Python]], [[UML]], [[FORTRAN]] * [[Autotools]] * [[GDB]], [[OProfile]], [[Valgrind]] * [[Doxygen]] * [[ChangeLog]], [[Subversion]] * [[Eclipse]] * [[Bash Scripts]] == [[Libraries]] == * [[GNU C Library]] * [[MPI]] * [[STL]], [[Boost]] * [[GSL]] * [[BLAS LAPACK ScaLAPACK]] * [[NLOPT]] * [[BitTorrent (B. 
Cohen's version)]] * [[CUDA SDK]] == Data processing == * [[gnuplot]], [[pgfplot]], [[matplotlib]] * [[Graphviz]] * [[Octave]], [[R]] * [[G3DViewer]] == Paper & Presentation Tools == * [[Dia]], [[PGF/Tikz]], [[pgfplot]] * [[LaTeX]], [[Beamer]] * [[BibTeX]], [[JabRef]] == Hardware == * [[HCL cluster]] * [[Other UCD Resources]] * [[UTK multicores + GPU]] * [[Grid5000]] * [[Desktop Backup]] [[SSH|How to connect to cluster via SSH]] [[hwloc|How to find information about the hardware]] == Mathematics == * [http://en.wikipedia.org/wiki/Confidence_interval Confidence interval (Statistics)], [http://en.wikipedia.org/wiki/Student's_t-distribution Student's t-distribution] (implemented in [[GSL]]) * [http://en.wikipedia.org/wiki/Linear_regression Linear regression] (implemented in [[GSL]]) * [http://en.wikipedia.org/wiki/Binomial_tree#Binomial_tree Binomial tree] (use [[Graphviz]] to visualize trees) * [http://en.wikipedia.org/wiki/Spline_interpolation Spline interpolation], [http://en.wikipedia.org/wiki/B-spline Spline approximation] (implemented in [[GSL]]) 0b80d91cd81bee241f0b1d97bebab8cbc779c6eb ChangeLog 0 18 788 39 2012-11-15T12:01:17Z Root 1 wikitext text/x-wiki * either <code>svn log -v > ChangeLog</code> with [[Subversion]] * or http://en.wikipedia.org/wiki/Changelog with the Linuxtools plugin in [[Eclipse]] 8344bf665de7ff66c1377d24b3746aab8efc983f Desktop Backup 0 94 790 2012-12-06T18:30:32Z Davepc 2 Created page with "Members of the HCL group may backup their desktops to heterogeneous server in the following directory /home/desktops/&lt;user&gt; This can easly be done with rsync as follow…" wikitext text/x-wiki Members of the HCL group may back up their desktops to the heterogeneous server in the following directory /home/desktops/&lt;user&gt; This can easily be done with rsync as follows rsync -axv /home/&lt;your_desktop_username&gt;/ &lt;user&gt;@heterogeneous:/home/desktops/&lt;user&gt;/ --exclude-from=.bkup_excludes and make the file .bkup_excludes with files and 
directories you would like to exclude, for example your download folder. An example of Dave's excludes file: <source lang="text">.Skype/ .Trash-1000/ .adobe/ .cache/ .config/chromium .dropbox/ .mozilla/ .ssh/ .svn/ .thumbnails/ .thunderbird/ Downloads/ Dropbox/ backups/</source> 40e827c0d00c71ab4cb5e04b7d95053e865a3977 791 790 2012-12-06T18:32:33Z Davepc 2 wikitext text/x-wiki Members of the HCL group may back up their desktops to the heterogeneous server in the following directory heterogeneous:/home/desktops/&lt;user&gt; This can easily be done with rsync as follows rsync -axv /home/&lt;your_desktop_username&gt;/ &lt;user&gt;@heterogeneous:/home/desktops/&lt;user&gt;/ --exclude-from=.bkup_excludes and make the file .bkup_excludes with files and directories you would like to exclude, for example your download folder. An example of Dave's excludes file: .Skype/ .Trash-1000/ .adobe/ .cache/ .config/chromium .dropbox/ .mozilla/ .ssh/ .svn/ .thumbnails/ .thunderbird/ Downloads/ Dropbox/ backups/ 4a4effbbb720a2ca8296b79ff2f368a078b4da4f Desktop Backup 0 94 792 791 2012-12-06T18:32:56Z Davepc 2 wikitext text/x-wiki Members of the HCL group may back up their desktops to the heterogeneous server in the following directory: heterogeneous:/home/desktops/&lt;user&gt; This can easily be done with rsync as follows: rsync -axv /home/&lt;your_desktop_username&gt;/ &lt;user&gt;@heterogeneous:/home/desktops/&lt;user&gt;/ --exclude-from=.bkup_excludes and make the file .bkup_excludes with files and directories you would like to exclude, for example your download folder. 
An example of Dave's excludes file: .Skype/ .Trash-1000/ .adobe/ .cache/ .config/chromium .dropbox/ .mozilla/ .ssh/ .svn/ .thumbnails/ .thunderbird/ Downloads/ Dropbox/ backups/ 4e7703eb55e47cb3fc4ecf814a446669362b76a9 793 792 2012-12-06T18:35:41Z Davepc 2 wikitext text/x-wiki Members of the HCL group may back up their desktops to the heterogeneous server in the following directory: heterogeneous:/home/desktops/&lt;user&gt; This can easily be done with rsync as follows: rsync -axv /home/&lt;your_desktop_username&gt;/ &lt;user&gt;@heterogeneous:/home/desktops/&lt;user&gt;/ --exclude-from=.bkup_excludes and make the file .bkup_excludes with files and directories you would like to exclude, for example your download folder, internet cache, etc. An example of Dave's excludes file: .Skype/ .Trash-1000/ .adobe/ .cache/ .config/chromium .dropbox/ .mozilla/ .ssh/ .svn/ .thumbnails/ .thunderbird/ Downloads/ Dropbox/ backups/ b1c3e63f0cc8b9cbb3ea140368887642ff339b98 Main Page 0 1 794 789 2013-03-19T14:48:39Z Xalid 12 /* Hardware */ wikitext text/x-wiki This site is set up for sharing ideas, findings and experience in heterogeneous computing. Please, log in and create new or edit existing pages. To learn how to format wiki pages, read [[Help:Editing|here]]. 
== HCL software for heterogeneous computing == * Extensions for [[MPI]]: [http://hcl.ucd.ie/project/mpC mpC] [http://hcl.ucd.ie/project/HeteroMPI HeteroMPI] [http://hcl.ucd.ie/project/libELC libELC] * Extensions for [http://en.wikipedia.org/wiki/GridRPC GridRPC]: [http://hcl.ucd.ie/project/SmartGridSolve SmartGridSolve] [http://hcl.ucd.ie/project/NI-Connect NI-Connect] * Computation benchmarking, modeling, dynamic load balancing: [http://hcl.ucd.ie/project/fupermod FuPerMod] [http://hcl.ucd.ie/project/pmm PMM] * Communication benchmarking, modeling, optimization: [http://hcl.ucd.ie/project/cpm CPM] [http://hcl.ucd.ie/project/mpiblib MPIBlib] == Heterogeneous mathematical software == * [http://hcl.ucd.ie/project/HeteroScaLAPACK HeteroScaLAPACK] * [http://hcl.ucd.ie/project/Hydropad Hydropad] == Operating systems == * [[Linux]] * [[Windows]] == Development tools == * [[C/C++]], [[Python]], [[UML]], [[FORTRAN]] * [[Autotools]] * [[GDB]], [[OProfile]], [[Valgrind]] * [[Doxygen]] * [[ChangeLog]], [[Subversion]] * [[Eclipse]] * [[Bash Scripts]] == [[Libraries]] == * [[GNU C Library]] * [[MPI]] * [[STL]], [[Boost]] * [[GSL]] * [[BLAS LAPACK ScaLAPACK]] * [[NLOPT]] * [[BitTorrent (B. 
Cohen's version)]] * [[CUDA SDK]] == Data processing == * [[gnuplot]], [[pgfplot]], [[matplotlib]] * [[Graphviz]] * [[Octave]], [[R]] * [[G3DViewer]] == Paper & Presentation Tools == * [[Dia]], [[PGF/Tikz]], [[pgfplot]] * [[LaTeX]], [[Beamer]] * [[BibTeX]], [[JabRef]] == Hardware == * [[HCL cluster]] * [[Other UCD Resources]] * [[UTK multicores + GPU]] * [[Grid5000]] * [[BlueGene/P]] * [[Desktop Backup]] [[SSH|How to connect to cluster via SSH]] [[hwloc|How to find information about the hardware]] == Mathematics == * [http://en.wikipedia.org/wiki/Confidence_interval Confidence interval (Statistics)], [http://en.wikipedia.org/wiki/Student's_t-distribution Student's t-distribution] (implemented in [[GSL]]) * [http://en.wikipedia.org/wiki/Linear_regression Linear regression] (implemented in [[GSL]]) * [http://en.wikipedia.org/wiki/Binomial_tree#Binomial_tree Binomial tree] (use [[Graphviz]] to visualize trees) * [http://en.wikipedia.org/wiki/Spline_interpolation Spline interpolation], [http://en.wikipedia.org/wiki/B-spline Spline approximation] (implemented in [[GSL]]) 3c21a6abea7214cc2e9fdcf7cff261f092b1ed39 814 794 2013-09-09T13:54:37Z Davepc 2 /* Hardware */ wikitext text/x-wiki This site is set up for sharing ideas, findings and experience in heterogeneous computing. Please, log in and create new or edit existing pages. How to format wiki-pages read [[Help:Editing|here]]. 
== HCL software for heterogeneous computing == * Extensions for [[MPI]]: [http://hcl.ucd.ie/project/mpC mpC] [http://hcl.ucd.ie/project/HeteroMPI HeteroMPI] [http://hcl.ucd.ie/project/libELC libELC] * Extensions for [http://en.wikipedia.org/wiki/GridRPC GridRPC]: [http://hcl.ucd.ie/project/SmartGridSolve SmartGridSolve] [http://hcl.ucd.ie/project/NI-Connect NI-Connect] * Computation benchmarking, modeling, dynamic load balancing: [http://hcl.ucd.ie/project/fupermod FuPerMod] [http://hcl.ucd.ie/project/pmm PMM] * Communication benchmarking, modeling, optimization: [http://hcl.ucd.ie/project/cpm CPM] [http://hcl.ucd.ie/project/mpiblib MPIBlib] == Heterogeneous mathematical software == * [http://hcl.ucd.ie/project/HeteroScaLAPACK HeteroScaLAPACK] * [http://hcl.ucd.ie/project/Hydropad Hydropad] == Operating systems == * [[Linux]] * [[Windows]] == Development tools == * [[C/C++]], [[Python]], [[UML]], [[FORTRAN]] * [[Autotools]] * [[GDB]], [[OProfile]], [[Valgrind]] * [[Doxygen]] * [[ChangeLog]], [[Subversion]] * [[Eclipse]] * [[Bash Scripts]] == [[Libraries]] == * [[GNU C Library]] * [[MPI]] * [[STL]], [[Boost]] * [[GSL]] * [[BLAS LAPACK ScaLAPACK]] * [[NLOPT]] * [[BitTorrent (B. 
Cohen's version)]] * [[CUDA SDK]] == Data processing == * [[gnuplot]], [[pgfplot]], [[matplotlib]] * [[Graphviz]] * [[Octave]], [[R]] * [[G3DViewer]] == Paper & Presentation Tools == * [[Dia]], [[PGF/Tikz]], [[pgfplot]] * [[LaTeX]], [[Beamer]] * [[BibTeX]], [[JabRef]] == Hardware == * [[HCL cluster]] * [[Other UCD Resources]] * [[UTK multicores + GPU]] * [[Grid5000]] * [[BlueGene/P]] * [[Desktop Backup]] * [[Memory size, overcommit, limit]] [[SSH|How to connect to cluster via SSH]] [[hwloc|How to find information about the hardware]] == Mathematics == * [http://en.wikipedia.org/wiki/Confidence_interval Confidence interval (Statistics)], [http://en.wikipedia.org/wiki/Student's_t-distribution Student's t-distribution] (implemented in [[GSL]]) * [http://en.wikipedia.org/wiki/Linear_regression Linear regression] (implemented in [[GSL]]) * [http://en.wikipedia.org/wiki/Binomial_tree#Binomial_tree Binomial tree] (use [[Graphviz]] to visualize trees) * [http://en.wikipedia.org/wiki/Spline_interpolation Spline interpolation], [http://en.wikipedia.org/wiki/B-spline Spline approximation] (implemented in [[GSL]]) 8512aa9772d18a8195347abb4f0ed083a0b85c43 827 814 2013-11-06T10:24:09Z Root 1 /* Paper & Presentation Tools */ wikitext text/x-wiki This site is set up for sharing ideas, findings and experience in heterogeneous computing. Please, log in and create new or edit existing pages. How to format wiki-pages read [[Help:Editing|here]]. 
== HCL software for heterogeneous computing == * Extensions for [[MPI]]: [http://hcl.ucd.ie/project/mpC mpC] [http://hcl.ucd.ie/project/HeteroMPI HeteroMPI] [http://hcl.ucd.ie/project/libELC libELC] * Extensions for [http://en.wikipedia.org/wiki/GridRPC GridRPC]: [http://hcl.ucd.ie/project/SmartGridSolve SmartGridSolve] [http://hcl.ucd.ie/project/NI-Connect NI-Connect] * Computation benchmarking, modeling, dynamic load balancing: [http://hcl.ucd.ie/project/fupermod FuPerMod] [http://hcl.ucd.ie/project/pmm PMM] * Communication benchmarking, modeling, optimization: [http://hcl.ucd.ie/project/cpm CPM] [http://hcl.ucd.ie/project/mpiblib MPIBlib] == Heterogeneous mathematical software == * [http://hcl.ucd.ie/project/HeteroScaLAPACK HeteroScaLAPACK] * [http://hcl.ucd.ie/project/Hydropad Hydropad] == Operating systems == * [[Linux]] * [[Windows]] == Development tools == * [[C/C++]], [[Python]], [[UML]], [[FORTRAN]] * [[Autotools]] * [[GDB]], [[OProfile]], [[Valgrind]] * [[Doxygen]] * [[ChangeLog]], [[Subversion]] * [[Eclipse]] * [[Bash Scripts]] == [[Libraries]] == * [[GNU C Library]] * [[MPI]] * [[STL]], [[Boost]] * [[GSL]] * [[BLAS LAPACK ScaLAPACK]] * [[NLOPT]] * [[BitTorrent (B. 
Cohen's version)]] * [[CUDA SDK]] == Data processing == * [[gnuplot]], [[pgfplot]], [[matplotlib]] * [[Graphviz]] * [[Octave]], [[R]] * [[G3DViewer]] == Paper & Presentation Tools == * [[Dia]], [[PGF/Tikz]], [[pgfplot]] * [[LaTeX]], [[Beamer]] * [[BibTeX]], [[JabRef]] HCL templates for slides and posters are available in the HCL publications repository under trunk/templates == Hardware == * [[HCL cluster]] * [[Other UCD Resources]] * [[UTK multicores + GPU]] * [[Grid5000]] * [[BlueGene/P]] * [[Desktop Backup]] * [[Memory size, overcommit, limit]] [[SSH|How to connect to cluster via SSH]] [[hwloc|How to find information about the hardware]] == Mathematics == * [http://en.wikipedia.org/wiki/Confidence_interval Confidence interval (Statistics)], [http://en.wikipedia.org/wiki/Student's_t-distribution Student's t-distribution] (implemented in [[GSL]]) * [http://en.wikipedia.org/wiki/Linear_regression Linear regression] (implemented in [[GSL]]) * [http://en.wikipedia.org/wiki/Binomial_tree#Binomial_tree Binomial tree] (use [[Graphviz]] to visualize trees) * [http://en.wikipedia.org/wiki/Spline_interpolation Spline interpolation], [http://en.wikipedia.org/wiki/B-spline Spline approximation] (implemented in [[GSL]]) 53cf4d85e392cb2eb00e9bc9b4ddb238dead6735 BlueGene/P 0 95 795 2013-03-19T15:00:15Z Xalid 12 Created page with "Some members of HCL have access to Shaheen BlueGene/P at King Abdullah University of Science and Technology ([http://www2.hpc.kaust.edu.sa/documentation/shaheen/]) . In addition,…" wikitext text/x-wiki Some members of HCL have access to Shaheen BlueGene/P at King Abdullah University of Science and Technology ([http://www2.hpc.kaust.edu.sa/documentation/shaheen/]). In addition, some members have access to a BlueGene/P at West University of Timisoara, Romania ([http://hpc.uvt.ro/infrastructure/bluegenep/]). 
15ca6a50c0c13665430ece03c112fe280f25e4f1 796 795 2013-03-19T15:02:10Z Xalid 12 wikitext text/x-wiki Some members of HCL have access to Shaheen BlueGene/P at King Abdullah University of Science and Technology ([http://www2.hpc.kaust.edu.sa/documentation/shaheen/]). In addition, some members have access to a BlueGene/P at West University of Timisoara, Romania ([http://hpc.uvt.ro/infrastructure/bluegenep/]). <br> 879cac0bbfe92197df885cf460ec246ba26195a7 797 796 2013-03-22T17:56:33Z Xalid 12 wikitext text/x-wiki Some members of HCL have access to Shaheen BlueGene/P at King Abdullah University of Science and Technology ([http://www2.hpc.kaust.edu.sa/documentation/shaheen/]). In addition, some members have access to a BlueGene/P at West University of Timisoara, Romania ([http://hpc.uvt.ro/infrastructure/bluegenep/]). ====== Fupermod on Shaheen BlueGene/P ====== To compile fupermod on the BG/P, first run the following commands to load the required libraries: #module load bluegene #module load essl #module load gsl Then the configure command can be executed as follows: /fupermod/configure --with-gsl-dir=/opt/share/math_libraries/gsl/ppc64/IBM --with-blas=essl CFLAGS="-O3 -qarch=450 -qtune=450" --with-essl-dir=/opt/share/ibmmath/essl/4.4/ On the BG/P, autotools did not pick up LD_LIBRARY_PATH, so the following hardcoded path was added to configure.ac. if test "$with_essl_dir" != ""; then<br>&nbsp; &nbsp; &nbsp; CPPFLAGS="$CPPFLAGS -I$with_essl_dir/include"<br>&nbsp; &nbsp; &nbsp; &nbsp;LDFLAGS="$LDFLAGS -L$with_essl_dir/lib '''-L/opt/ibmcmp/xlf/bg/11.1/lib'''"<br> fi<br> 9e6a4ce14845e221d63fefb78ba2af90b92b8d23 798 797 2013-03-22T17:57:36Z Xalid 12 wikitext text/x-wiki Some members of HCL have access to Shaheen BlueGene/P at King Abdullah University of Science and Technology ([http://www2.hpc.kaust.edu.sa/documentation/shaheen/]). In addition, some members have access to a BlueGene/P at West University of Timisoara, Romania ([http://hpc.uvt.ro/infrastructure/bluegenep/]). 
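Collected into a single session, the build steps quoted above look like this (module names and install paths are the ones given on this page for Shaheen and may differ on other BlueGene installations; the final make is an assumed follow-up step):

```shell
# Load the libraries needed to build fupermod on the BG/P front end
module load bluegene
module load essl
module load gsl

# Configure against the GSL/ESSL installations quoted above,
# tuning the compiler for the BG/P's PowerPC 450 cores
/fupermod/configure \
  --with-gsl-dir=/opt/share/math_libraries/gsl/ppc64/IBM \
  --with-blas=essl \
  --with-essl-dir=/opt/share/ibmmath/essl/4.4/ \
  CFLAGS="-O3 -qarch=450 -qtune=450"

make
```

This is a site-specific command transcript, not a portable build recipe.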
===== Fupermod on Shaheen BlueGene/P ===== To compile fupermod on the BG/P, first run the following commands to load the required libraries: #module load bluegene #module load essl #module load gsl Then the configure command can be executed as follows: /fupermod/configure --with-gsl-dir=/opt/share/math_libraries/gsl/ppc64/IBM --with-blas=essl CFLAGS="-O3 -qarch=450 -qtune=450" --with-essl-dir=/opt/share/ibmmath/essl/4.4/ On the BG/P, autotools did not pick up LD_LIBRARY_PATH, so the following hardcoded path was added to configure.ac. if test "$with_essl_dir"&nbsp;!= ""; then<br>&nbsp; &nbsp; &nbsp; CPPFLAGS="$CPPFLAGS -I$with_essl_dir/include"<br>&nbsp; &nbsp; &nbsp; &nbsp;LDFLAGS="$LDFLAGS -L$with_essl_dir/lib '''-L/opt/ibmcmp/xlf/bg/11.1/lib'''"<br> fi<br> 67cb69e890d597d52859933bee62bb38734ac5e8 799 798 2013-03-22T17:59:29Z Xalid 12 wikitext text/x-wiki Some members of HCL have access to Shaheen BlueGene/P at King Abdullah University of Science and Technology ([http://www2.hpc.kaust.edu.sa/documentation/shaheen/]). In addition, some members have access to a BlueGene/P at West University of Timisoara, Romania ([http://hpc.uvt.ro/infrastructure/bluegenep/]). ===== Fupermod on Shaheen BlueGene/P ===== To compile fupermod on the BG/P, first run the following commands to load the required libraries: #module load bluegene #module load essl #module load gsl Then the configure command can be executed as follows: /fupermod/configure --with-gsl-dir=/opt/share/math_libraries/gsl/ppc64/IBM --with-blas=essl CFLAGS="-O3 -qarch=450 -qtune=450" --with-essl-dir=/opt/share/ibmmath/essl/4.4/ On the BG/P, autotools did not pick up LD_LIBRARY_PATH, so the following hardcoded path was added to configure.ac. 
&lt;!-- if test "$with_essl_dir"&nbsp;!= ""; then<br>&nbsp; &nbsp; &nbsp; CPPFLAGS="$CPPFLAGS -I$with_essl_dir/include"<br>&nbsp; &nbsp; &nbsp; &nbsp;LDFLAGS="$LDFLAGS -L$with_essl_dir/lib '''-L/opt/ibmcmp/xlf/bg/11.1/lib'''"<br> fi<br> --&gt; 3e1f14b991de0bc5887092cc2aac6083cc4292b4 800 799 2013-03-22T18:00:16Z Xalid 12 wikitext text/x-wiki Some members of HCL have access to Shaheen BlueGene/P at King Abdullah University of Science and Technology ([http://www2.hpc.kaust.edu.sa/documentation/shaheen/]). In addition, some members have access to a BlueGene/P at West University of Timisoara, Romania ([http://hpc.uvt.ro/infrastructure/bluegenep/]). ===== Fupermod on Shaheen BlueGene/P ===== To compile fupermod on the BG/P, first run the following commands to load the required libraries: #module load bluegene #module load essl #module load gsl Then the configure command can be executed as follows: /fupermod/configure --with-gsl-dir=/opt/share/math_libraries/gsl/ppc64/IBM --with-blas=essl CFLAGS="-O3 -qarch=450 -qtune=450" --with-essl-dir=/opt/share/ibmmath/essl/4.4/ On the BG/P, autotools did not pick up LD_LIBRARY_PATH, so the following hardcoded path was added to configure.ac. &lt;!-- rss --&gt; if test "$with_essl_dir"&nbsp;!= ""; then<br>&nbsp; &nbsp; &nbsp; CPPFLAGS="$CPPFLAGS -I$with_essl_dir/include"<br>&nbsp; &nbsp; &nbsp; &nbsp;LDFLAGS="$LDFLAGS -L$with_essl_dir/lib '''-L/opt/ibmcmp/xlf/bg/11.1/lib'''"<br> fi<br> <br> --&gt; d1b01b63e781c36206fb49c85727e66d2ea14b7c 801 800 2013-03-22T18:01:18Z Xalid 12 wikitext text/x-wiki Some members of HCL have access to Shaheen BlueGene/P at King Abdullah University of Science and Technology ([http://www2.hpc.kaust.edu.sa/documentation/shaheen/]). In addition, some members have access to a BlueGene/P at West University of Timisoara, Romania ([http://hpc.uvt.ro/infrastructure/bluegenep/]). 
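Unwrapped from the wiki markup, the configure.ac fragment quoted above reads as follows (the XL Fortran library path is the hard-coded one given on this page):

```shell
# configure.ac fragment: when --with-essl-dir is given, add the ESSL
# headers and libraries, plus the hard-coded XL Fortran runtime path
# that autotools did not find via LD_LIBRARY_PATH
if test "$with_essl_dir" != ""; then
  CPPFLAGS="$CPPFLAGS -I$with_essl_dir/include"
  LDFLAGS="$LDFLAGS -L$with_essl_dir/lib -L/opt/ibmcmp/xlf/bg/11.1/lib"
fi
```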
===== Fupermod on Shaheen BlueGene/P ===== To compile fupermod on the BG/P, first run the following commands to load the required libraries: #module load bluegene #module load essl #module load gsl Then the configure command can be executed as follows: /fupermod/configure --with-gsl-dir=/opt/share/math_libraries/gsl/ppc64/IBM --with-blas=essl CFLAGS="-O3 -qarch=450 -qtune=450" --with-essl-dir=/opt/share/ibmmath/essl/4.4/<br> 8a8f63ef8996aef6ae5fdcaeb58f25a97b5a5ecb C/C++ 0 14 802 781 2013-04-09T17:47:34Z Davepc 2 wikitext text/x-wiki == Coding == * C++ programming style is preferable. For example, in variable declarations, pointers and references should have their reference symbol next to the type rather than to the name. Variables should be initialized where they are declared, and should be declared where they are used. For more details, see [http://google-styleguide.googlecode.com/svn/trunk/cppguide.xml Google C++ Style Guide] * [http://en.wikipedia.org/wiki/Indent_style#Variant:_1TBS One-true-brace indent style] * [http://en.wikipedia.org/wiki/Pragma_once Coding header files] * Learn from examples and use coding approaches from third-party software == Commenting == * Place [[Doxygen]] comments in header files (before declarations of namespaces/classes/structs/typedefs/macros) and main source files (for documenting tools and tests) * Use double forward slash for short comments in the code == C++ == * [http://developers.sun.com/solaris/articles/mixing.html Mixing C/C++] * Provide the main API in C * Use plain C unless you need flexible data structures or [[STL]]/[[Boost]] functionality * [http://en.wikipedia.org/wiki/Template_metaprogramming Template C++] is preferable from the point of view of runtime performance * Mind the life cycle of objects: [http://en.wikipedia.org/wiki/Default_constructor Default constructor], [http://en.wikipedia.org/wiki/Copy_constructor Copy constructor], [http://en.wikipedia.org/wiki/Destructor_(computer_science) Destructor] *
[http://www.gnu.org/software/hello/manual/automake/Libtool-Convenience-Libraries.html Force C++ linking] == Tips &amp; Tricks == *[http://www.gnu.org/s/libc/manual/html_node/Date-and-Time.html#Date-and-Time Timing in C] *Don't use non-standard functions, like [http://en.wikipedia.org/wiki/Itoa itoa] *[http://www.gnu.org/software/libc/manual/html_node/Program-Arguments.html Handling program arguments] (avoid <code>argp</code> since it is not supported on many platforms) *[http://en.wikipedia.org/wiki/Dynamic_loading Dynamic loading of shared libraries] *Avoid [http://en.wikipedia.org/wiki/Variable-length_array variable-length arrays]. First, GCC allocates them on the stack. Second, the status of this feature in GCC is BROKEN. Therefore, never do this: <source lang="C"> int size; MPI_Comm_size(MPI_COMM_WORLD, &size); char names[size][MPI_MAX_PROCESSOR_NAME]; </source> *Implement delays in the execution of the program with the help of [http://linux.die.net/man/2/nanosleep nanosleep]. Compared to sleep and usleep, nanosleep has the advantages of not affecting any signals, being standardized by POSIX, and providing higher timing resolution, and it makes it easier to resume a sleep that has been interrupted by a signal. *Indenting in fupermod is done in the [http://google-styleguide.googlecode.com/svn/trunk/cppguide.xml?showone=Spaces_vs._Tabs#Spaces_vs._Tabs google code style], two literal spaces, no tabs. To set vim to do this, put the following in .vimrc: set autoindent set expandtab set tabstop=2 set shiftwidth=2 set softtabstop=2 *To indent all .c and .h files with vim use the following ([http://stackoverflow.com/questions/3218528/indenting-in-vim-with-all-the-files-in-folder explained here]): :args ./*/*.[ch] | argdo execute "normal gg=G" | update or use the Unix command $ indent == Color GCC == Colours the output of GCC so you can see errors and warnings. sudo apt-get install colorgcc ln -s /usr/bin/colorgcc ~/bin/gcc *Make sure ~/bin is in the path _before_ gcc. 
(Add ~/bin to the path in ~/.profile) f03cb238e40ed017400f74857476b118010c1b14 Grid5000 0 6 803 777 2013-07-03T20:56:22Z Davepc 2 wikitext text/x-wiki https://www.grid5000.fr/mediawiki/index.php/Grid5000:Home [https://www.grid5000.fr/mediawiki/index.php/Grid5000:UserCharter USAGE POLICY]&nbsp; - Very important, after booking nodes (oarsub ...) run the command:&nbsp;<source lang="">outofchart</source>&nbsp;This checks that you haven't booked too many resources, which would get you in trouble with the Grid5000 admins. == Login, job submission, deployment of image == *Select sites and clusters for experiments, using information on the [https://www.grid5000.fr/mediawiki/index.php/Grid5000:Network#Grid.275000_Sites Grid5000 network] and the [https://www.grid5000.fr/mediawiki/index.php/Status Status page] *Access is provided via access nodes '''access.SITE.grid5000.fr''' marked [https://www.grid5000.fr/mediawiki/index.php/External_access here] as ''accessible from '''everywhere''' via ssh with '''keyboard-interactive''' authentication method''. As soon as you are on one of the sites, you can ssh directly to the frontend node of any other site: <source lang="bash"> access_$ ssh frontend.SITE2 </source> *There is no Internet access from computing nodes (external IPs should be registered on the proxy); therefore, download/update your stuff at the access nodes. Several revision control clients are available. *Each site has a separate NFS; therefore, to run an application on several sites at once, you need to copy it ('''scp, sftp, rsync''') between access or frontend nodes. *Jobs are run from the frontend nodes, using a [http://en.wikipedia.org/wiki/OpenPBS PBS]-like system [https://www.grid5000.fr/mediawiki/index.php/Cluster_experiment-OAR2 OAR]. 
Basic commands: **'''oarstat''' - queue status **'''oarsub''' - job submission **'''oardel''' - job removal Interactive job on deployed images: <source lang="bash"> frontend_$ oarsub -I -t deploy -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> Batch job on installed images: <source lang="bash"> frontend_$ oarsub BATCH_FILE -t allow_classic_ssh -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> Specifying cluster name to reserve: <source lang="bash"> oarsub -r 'YYYY-MM-dd HH:mm:ss' -l nodes=2,walltime=1 -p "cluster='Genepi'" </source> If the resources are available, two nodes from the cluster "Genepi" will be reserved for the specified time. *The image to deploy can be created and loaded with the help of a [http://wiki.systemimager.org/index.php/Main_Page Systemimager]-like system [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2 Kadeploy]. Creating: [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2#Tune_an_environment_to_build_another_one:_customize_authentification_parameters described here] Loading: <source lang="bash"> frontend_$ kadeploy3 -a PATH_TO_PRIVATE_IMAGE_DESC -f $OAR_FILE_NODES </source> A Linux distribution lenny-x64-nfs-2.1 with mc, subversion, autotools, doxygen, MPICH2, GSL, Boost, R, gnuplot, graphviz, X11, evince is available at Orsay in /home/nancy/alastovetsky/grid5000. == Compiling and running MPI applications == *Compilation should be done on one of the reserved nodes (e.g. ssh `head -n 1 $OAR_NODEFILE`) *Running MPI applications is described [https://www.grid5000.fr/mediawiki/index.php/Run_MPI_On_Grid%275000 here] **mpirun/mpiexec should be run from one of the reserved nodes (e.g. 
ssh `head -n 1 $OAR_NODEFILE`) == Setting up new deploy image == List available images kaenv3 -l Then book a node and launch: oarsub -I -t deploy -l nodes=1,walltime=12 kadeploy3 -e squeeze-x64-big -f $OAR_FILE_NODES -k ssh root@`head -n 1 $OAR_NODEFILE` default password: grid5000 edit /etc/apt/sources.list apt-get update apt-get upgrade apt-get install libtool autoconf automake mc colorgcc ctags libboost-serialization-dev libboost-graph-dev libatlas-base-dev gfortran vim gdb valgrind screen subversion iperf bc gsl-bin libgsl0-dev Possibly also install (for using extrae): apt-get install libxml2-dev binutils-dev libunwind7-dev <br> Compiled from sources by us: *<strike>gsl-1.14 (download: ftp://ftp.gnu.org/gnu/gsl/)&nbsp;</strike> ''Now with squeeze it is in the repository.'' <strike>./configure &amp;&amp; make &amp;&amp; make install</strike> *mpich2 (download: http://www.mcs.anl.gov/research/projects/mpich2/downloads/index.php?s=downloads) ./configure --enable-shared --enable-sharedlibs=gcc --with-pm=mpd make &amp;&amp; make install Mpich2 installed to: Installing MPE2 include files to /usr/local/include Installing MPE2 libraries to /usr/local/lib Installing MPE2 utility programs to /usr/local/bin Installing MPE2 configuration files to /usr/local/etc Installing MPE2 system utility programs to /usr/local/sbin Installing MPE2 man to /usr/local/share/man Installing MPE2 html to /usr/local/share/doc/ Installed MPE2 in /usr/local *hwloc (and lstopo) (download: http://www.open-mpi.org/software/hwloc/v1.2/) compile from sources. To get XML support, install libxml2-dev and pkg-config apt-get install libxml2-dev pkg-config tar -xzvf hwloc-1.1.1.tar.gz cd hwloc-1.1.1 ./configure &amp;&amp; make &amp;&amp; make install Change root password. rm sources from root dir. 
Edit the "message of the day" vi /etc/motd.tail echo 90 &gt; /proc/sys/vm/overcommit_ratio echo 2 &gt; /proc/sys/vm/overcommit_memory date &gt;&gt; release Cleanup apt-get clean rm /etc/udev/rules.d/*-persistent-net.rules Make image ssh root@'''node''' tgz-g5k &gt; $HOME/grid5000/'''imagename'''.tgz Make an appropriate .env file. kaenv3 -p lenny-x64-nfs -u deploy &gt; lenny-x64-custom-2.3.env <br> == GotoBLAS2 == http://www.tacc.utexas.edu/tacc-projects/gotoblas2 When compiling gotoblas on a node without direct internet access you get this error: <source lang="">wget http://www.netlib.org/lapack/lapack-3.1.1.tgz --2011-05-19 03:11:03-- http://www.netlib.org/lapack/lapack-3.1.1.tgz Resolving www.netlib.org... 160.36.58.108 Connecting to www.netlib.org|160.36.58.108|:80... failed: Connection timed out. Retrying. --2011-05-19 03:14:13-- (try: 2) http://www.netlib.org/lapack/lapack-3.1.1.tgz Connecting to www.netlib.org|160.36.58.108|:80... failed: Connection timed out. Retrying. ...</source> Fix by downloading http://www.netlib.org/lapack/lapack-3.1.1.tgz to the GotoBLAS2 source directory and editing this line in the Makefile 184c184 &lt; -wget http://www.netlib.org/lapack/lapack-3.1.1.tgz --- &gt; # -wget http://www.netlib.org/lapack/lapack-3.1.1.tgz <br> GotoBLAS needs to be compiled individually for each unique machine, i.e. each cluster. Add the following to .bashrc export CLUSTER=`hostname |sed 's/\([a-z]*\).*/\1/'` LD_LIBRARY_PATH=$HOME/lib/$CLUSTER:$HOME/lib:/usr/local/lib:$LD_LIBRARY_PATH export LIBRARY_PATH=$HOME/lib/$CLUSTER:$HOME/lib:/usr/local/lib:$LIBRARY_PATH Run the following script once on each cluster: <source lang="bash">#! /bin/bash echo "Compiling gotoblas for cluster: $CLUSTER" cd $HOME/src if [ ! -d "$CLUSTER" ]; then mkdir $CLUSTER fi cd $CLUSTER tar -xzf ../Goto*.tar.gz cd Goto* make &> m.log if [ ! 
-d "$HOME/lib/$CLUSTER" ]; then mkdir $HOME/lib/$CLUSTER fi cp libgoto2.so $HOME/lib/$CLUSTER echo results ls -d $HOME/src/$CLUSTER ls $HOME/src/$CLUSTER ls -d $HOME/lib/$CLUSTER ls $HOME/lib/$CLUSTER</source> Note: for newer processors this may fail. If it is a NEHALEM processor try: make clean make TARGET=NEHALEM == Paging and the OOM-Killer == When running experiments that exhaust available memory, problems can occur with over-commit. See [[HCL cluster#Paging_and_the_OOM-Killer]] for more detail. == Example of experiment setup across several sites == Sources of all files mentioned below are available at: [[Grid5000:sources]]. Pick one head node as the main head node (I use grenoble, but any will do). Setup sources cd dave/fupermod-1.1.0 make clean ./configure --with-cblas=goto --prefix=/usr/local/ Reserve 2 nodes from all clusters on a 3 cluster site: oarsub -r "2011-07-25 11:01:01" -t deploy -l cluster=3/nodes=2,walltime=11:59:00 Automate with: for a in 2 3 4; do for i in `cat sites.$a`; do echo $a $i; ssh $i oarsub -r "2011-07-25 11:01:01" -t deploy -l cluster=$a/nodes=2,walltime=11:59:00; done; done Then on each site, where xxx is the site name: kadeploy3 -a $HOME/grid5000/lenny-dave.env -f $OAR_NODE_FILE --output-ok-nodes deployed.xxx Gather deployed files to a head node: for i in `cat ~/sites `; do echo $i; scp $i:deployed* .&nbsp;; done cat deployed.* &gt; deployed.all Copy cluster-specific libs to each deployed node's /usr/local/lib dir with the script copy_local_libs.sh deployed.all Copy source files to the root dir of each deployed node. 
Then make install each (note: ssh -f does this in parallel) for i in `cat ~/deployed.all`; do echo $i; rsync -aP ~/dave/fupermod-1.1.0 root@$i:&nbsp;; done for i in `cat ~/deployed.all`; do echo $i; ssh -f root@$i "cd fupermod-1.1.0&nbsp;; make all install"&nbsp;; done ssh to the first node ssh `head -n1 deployed.all` n=$(cat deployed.all |wc -l) mpdboot --totalnum=$n --file=$HOME/deployed.all mpdtrace cd dave/data/ mpirun -n $n /usr/local/bin/partitioner -l /usr/local/lib/libmxm_col.so -a0 -D10000 -o N=100 Cleanup after: for i in `cat ~/sites `; do echo $i; ssh $i rm deployed.*&nbsp;; done == Check network speed == apt-get install iperf == Choose which network interface to use == mpirun --mca btl self,openib ... or mpirun --mca btl self,tcp ... == Installing Gadget-2.0.7 == # apt-get install hdf5-openmpi-dev sfftw-dev $ tar -xzvf gadget2.tar.gz $ cd Gadget-2.0.7/Gadget2 $ make CFLAGS="-DH5_USE_16_API $ make clean; make 1ffd21d9a7663e074284e37b860754d03a8f3ea5 804 803 2013-07-08T16:39:16Z Davepc 2 wikitext text/x-wiki https://www.grid5000.fr/mediawiki/index.php/Grid5000:Home [https://www.grid5000.fr/mediawiki/index.php/Grid5000:UserCharter USAGE POLICY]&nbsp; - Very important, after booking nodes (oarsub ...) run the command:&nbsp;<source lang="">outofchart</source>&nbsp;This checks that you haven't booked too many resources, which would get you in trouble with the Grid5000 admins. == Login, job submission, deployment of image == *Select sites and clusters for experiments, using information on the [https://www.grid5000.fr/mediawiki/index.php/Grid5000:Network#Grid.275000_Sites Grid5000 network] and the [https://www.grid5000.fr/mediawiki/index.php/Status Status page] *Access is provided via access nodes '''access.SITE.grid5000.fr''' marked [https://www.grid5000.fr/mediawiki/index.php/External_access here] as ''accessible from '''everywhere''' via ssh with '''keyboard-interactive''' authentication method''. 
As soon as you are on one of the sites, you can ssh directly to the frontend node of any other site: <source lang="bash"> access_$ ssh frontend.SITE2 </source> *There is no Internet access from computing nodes (external IPs should be registered on the proxy); therefore, download/update your stuff at the access nodes. Several revision control clients are available. *Each site has a separate NFS; therefore, to run an application on several sites at once, you need to copy it ('''scp, sftp, rsync''') between access or frontend nodes. *Jobs are run from the frontend nodes, using a [http://en.wikipedia.org/wiki/OpenPBS PBS]-like system [https://www.grid5000.fr/mediawiki/index.php/Cluster_experiment-OAR2 OAR]. Basic commands: **'''oarstat''' - queue status **'''oarsub''' - job submission **'''oardel''' - job removal Interactive job on deployed images: <source lang="bash"> frontend_$ oarsub -I -t deploy -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> Batch job on installed images: <source lang="bash"> frontend_$ oarsub BATCH_FILE -t allow_classic_ssh -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> Specifying cluster name to reserve: <source lang="bash"> oarsub -r 'YYYY-MM-dd HH:mm:ss' -l nodes=2,walltime=1 -p "cluster='Genepi'" </source> If the resources are available, two nodes from the cluster "Genepi" will be reserved for the specified time. *The image to deploy can be created and loaded with the help of a [http://wiki.systemimager.org/index.php/Main_Page Systemimager]-like system [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2 Kadeploy]. 
Creating: [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2#Tune_an_environment_to_build_another_one:_customize_authentification_parameters described here] Loading: <source lang="bash"> frontend_$ kadeploy3 -a PATH_TO_PRIVATE_IMAGE_DESC -f $OAR_FILE_NODES </source> A Linux distribution lenny-x64-nfs-2.1 with mc, subversion, autotools, doxygen, MPICH2, GSL, Boost, R, gnuplot, graphviz, X11, evince is available at Orsay in /home/nancy/alastovetsky/grid5000. == Compiling and running MPI applications == *Compilation should be done on one of the reserved nodes (e.g. ssh `head -n 1 $OAR_NODEFILE`) *Running MPI applications is described [https://www.grid5000.fr/mediawiki/index.php/Run_MPI_On_Grid%275000 here] **mpirun/mpiexec should be run from one of the reserved nodes (e.g. ssh `head -n 1 $OAR_NODEFILE`) == Setting up new deploy image == List available images kaenv3 -l Then book a node and launch: oarsub -I -t deploy -l nodes=1,walltime=12 kadeploy3 -e squeeze-x64-big -f $OAR_FILE_NODES -k ssh root@`head -n 1 $OAR_NODEFILE` default password: grid5000 edit /etc/apt/sources.list apt-get update apt-get upgrade apt-get install libtool autoconf automake mc colorgcc ctags libboost-serialization-dev libboost-graph-dev libatlas-base-dev gfortran vim gdb valgrind screen subversion iperf bc gsl-bin libgsl0-dev Possibly also install (for using extrae): apt-get install libxml2-dev binutils-dev libunwind7-dev <br> Compiled from sources by us: *<strike>gsl-1.14 (download: ftp://ftp.gnu.org/gnu/gsl/)&nbsp;</strike> ''Now with squeeze it is in the repository.'' <strike>./configure &amp;&amp; make &amp;&amp; make install</strike> *mpich2 (download: http://www.mcs.anl.gov/research/projects/mpich2/downloads/index.php?s=downloads) ./configure --enable-shared --enable-sharedlibs=gcc --with-pm=mpd make &amp;&amp; make install Mpich2 installed to: Installing MPE2 include files to /usr/local/include Installing MPE2 libraries to /usr/local/lib Installing MPE2 utility programs to 
/usr/local/bin Installing MPE2 configuration files to /usr/local/etc Installing MPE2 system utility programs to /usr/local/sbin Installing MPE2 man to /usr/local/share/man Installing MPE2 html to /usr/local/share/doc/ Installed MPE2 in /usr/local *hwloc (and lstopo) (download: http://www.open-mpi.org/software/hwloc/v1.2/) compile from sources. To get XML support, install libxml2-dev and pkg-config apt-get install libxml2-dev pkg-config tar -xzvf hwloc-1.1.1.tar.gz cd hwloc-1.1.1 ./configure &amp;&amp; make &amp;&amp; make install Change root password. rm sources from root dir. Edit the "message of the day" vi /etc/motd.tail echo 90 &gt; /proc/sys/vm/overcommit_ratio echo 2 &gt; /proc/sys/vm/overcommit_memory date &gt;&gt; release Cleanup apt-get clean rm /etc/udev/rules.d/*-persistent-net.rules Make image ssh root@'''node''' tgz-g5k &gt; $HOME/grid5000/'''imagename'''.tgz Make an appropriate .env file. kaenv3 -p lenny-x64-nfs -u deploy &gt; lenny-x64-custom-2.3.env <br> == GotoBLAS2 == http://www.tacc.utexas.edu/tacc-projects/gotoblas2 When compiling gotoblas on a node without direct internet access you get this error: <source lang="">wget http://www.netlib.org/lapack/lapack-3.1.1.tgz --2011-05-19 03:11:03-- http://www.netlib.org/lapack/lapack-3.1.1.tgz Resolving www.netlib.org... 160.36.58.108 Connecting to www.netlib.org|160.36.58.108|:80... failed: Connection timed out. Retrying. --2011-05-19 03:14:13-- (try: 2) http://www.netlib.org/lapack/lapack-3.1.1.tgz Connecting to www.netlib.org|160.36.58.108|:80... failed: Connection timed out. Retrying. ...</source> Fix by downloading http://www.netlib.org/lapack/lapack-3.1.1.tgz to the GotoBLAS2 source directory and editing this line in the Makefile 184c184 &lt; -wget http://www.netlib.org/lapack/lapack-3.1.1.tgz --- &gt; # -wget http://www.netlib.org/lapack/lapack-3.1.1.tgz <br> GotoBLAS needs to be compiled individually for each unique machine, i.e. each cluster. 
Add the following to .bashrc export CLUSTER=`hostname |sed 's/\([a-z]*\).*/\1/'` LD_LIBRARY_PATH=$HOME/lib/$CLUSTER:$HOME/lib:/usr/local/lib:$LD_LIBRARY_PATH export LIBRARY_PATH=$HOME/lib/$CLUSTER:$HOME/lib:/usr/local/lib:$LIBRARY_PATH Run the following script once on each cluster: <source lang="bash">#! /bin/bash echo "Compiling gotoblas for cluster: $CLUSTER" cd $HOME/src if [ ! -d "$CLUSTER" ]; then mkdir $CLUSTER fi cd $CLUSTER tar -xzf ../Goto*.tar.gz cd Goto* make &> m.log if [ ! -d "$HOME/lib/$CLUSTER" ]; then mkdir $HOME/lib/$CLUSTER fi cp libgoto2.so $HOME/lib/$CLUSTER echo results ls -d $HOME/src/$CLUSTER ls $HOME/src/$CLUSTER ls -d $HOME/lib/$CLUSTER ls $HOME/lib/$CLUSTER</source> Note: for newer processors this may fail. If it is a NEHALEM processor try: make clean make TARGET=NEHALEM == Paging and the OOM-Killer == When running experiments that exhaust available memory, problems can occur with over-commit. See [[HCL cluster#Paging_and_the_OOM-Killer]] for more detail. == Example of experiment setup across several sites == Sources of all files mentioned below are available at: [[Grid5000:sources]]. Pick one head node as the main head node (I use grenoble, but any will do). 
Setup sources cd dave/fupermod-1.1.0 make clean ./configure --with-cblas=goto --prefix=/usr/local/ Reserve 2 nodes from all clusters on a 3 cluster site: oarsub -r "2011-07-25 11:01:01" -t deploy -l cluster=3/nodes=2,walltime=11:59:00 Automate with: for a in 2 3 4; do for i in `cat sites.$a`; do echo $a $i; ssh $i oarsub -r "2011-07-25 11:01:01" -t deploy -l cluster=$a/nodes=2,walltime=11:59:00; done; done Then on each site, where xxx is the site name: kadeploy3 -a $HOME/grid5000/lenny-dave.env -f $OAR_NODE_FILE --output-ok-nodes deployed.xxx Gather deployed files to a head node: for i in `cat ~/sites `; do echo $i; scp $i:deployed* .&nbsp;; done cat deployed.* &gt; deployed.all Copy cluster-specific libs to each deployed node's /usr/local/lib dir with the script copy_local_libs.sh deployed.all Copy source files to the root dir of each deployed node. Then make install each (note: ssh -f does this in parallel) for i in `cat ~/deployed.all`; do echo $i; rsync -aP ~/dave/fupermod-1.1.0 root@$i:&nbsp;; done for i in `cat ~/deployed.all`; do echo $i; ssh -f root@$i "cd fupermod-1.1.0&nbsp;; make all install"&nbsp;; done ssh to the first node ssh `head -n1 deployed.all` n=$(cat deployed.all |wc -l) mpdboot --totalnum=$n --file=$HOME/deployed.all mpdtrace cd dave/data/ mpirun -n $n /usr/local/bin/partitioner -l /usr/local/lib/libmxm_col.so -a0 -D10000 -o N=100 Cleanup after: for i in `cat ~/sites `; do echo $i; ssh $i rm deployed.*&nbsp;; done == Check network speed == apt-get install iperf == Choose which network interface to use == mpirun --mca btl self,openib ... or mpirun --mca btl self,tcp ... 
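The iperf package installed above can then be used to check link speed between two reserved nodes; a minimal sketch (node names are placeholders):

```shell
# On one reserved node, start an iperf server:
iperf -s

# On another reserved node, measure TCP throughput to it for 10 seconds:
iperf -c NODE1 -t 10
```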
== Installing Gadget-2.0.7 == # apt-get install hdf5-openmpi-dev sfftw-dev $ tar -xzvf gadget2.tar.gz $ cd Gadget-2.0.7/Gadget2 $ make CFLAGS="-DH5_USE_16_API $ make clean; make ec0bf1150f19d35f975a74239e8f568dcab19939 805 804 2013-07-08T20:04:24Z Davepc 2 wikitext text/x-wiki https://www.grid5000.fr/mediawiki/index.php/Grid5000:Home [https://www.grid5000.fr/mediawiki/index.php/Grid5000:UserCharter USAGE POLICY]&nbsp; - Very important, after booking nodes (oarsub ...) run the command:&nbsp;<source lang="">outofchart</source>&nbsp;This checks that you haven't booked too many resources, which would get you in trouble with the Grid5000 admins. == Login, job submission, deployment of image == *Select sites and clusters for experiments, using information on the [https://www.grid5000.fr/mediawiki/index.php/Grid5000:Network#Grid.275000_Sites Grid5000 network] and the [https://www.grid5000.fr/mediawiki/index.php/Status Status page] *Access is provided via access nodes '''access.SITE.grid5000.fr''' marked [https://www.grid5000.fr/mediawiki/index.php/External_access here] as ''accessible from '''everywhere''' via ssh with '''keyboard-interactive''' authentication method''. As soon as you are on one of the sites, you can ssh directly to the frontend node of any other site: <source lang="bash"> access_$ ssh frontend.SITE2 </source> *There is no Internet access from computing nodes (external IPs should be registered on the proxy); therefore, download/update your stuff at the access nodes. Several revision control clients are available. *Each site has a separate NFS; therefore, to run an application on several sites at once, you need to copy it ('''scp, sftp, rsync''') between access or frontend nodes. *Jobs are run from the frontend nodes, using a [http://en.wikipedia.org/wiki/OpenPBS PBS]-like system [https://www.grid5000.fr/mediawiki/index.php/Cluster_experiment-OAR2 OAR]. 
== Installing Wrekavoc ==
Download from http://wrekavoc.gforge.inria.fr/
<source lang="bash">
# apt-get install libxml2-dev pkg-config
# tar -xzvf wrekavoc-1.1.tar.gz
# cd wrekavoc-1.1/
# ./configure
# make
# ./src/burn 50
</source>
== Installing Extrae ==
(on grid5000 wheezy big)

First install [http://www.dyninst.org/ Dyninst]:
<source lang="bash">
# apt-get install libelf-dev libdwarf-dev
# tar -xzvf DyninstAPI-8.1.2.tgz
# cd DyninstAPI-8.1.2
# ./configure --with-libdwarf-static
# make
# make install
</source>
Then Extrae:
<source lang="bash">
# apt-get install
# ./configure --with-mpi=/usr --with-mpi-libs=/usr/lib --with-papi=/usr/local --with-unwind=/usr
</source>
<br> == Login, job submission, deployment of image == *Select sites and clusters for experiments, using information on the [https://www.grid5000.fr/mediawiki/index.php/Grid5000:Network#Grid.275000_Sites Grid5000 network] and the [https://www.grid5000.fr/mediawiki/index.php/Status Status page] *Access is provided via access nodes '''access.SITE.grid5000.fr''' marked [https://www.grid5000.fr/mediawiki/index.php/External_access here] as ''accessible from '''everywhere''' via ssh with '''keyboard-interactive''' authentication method''. As soon as you are on one of the sites, you can directly ssh frontend node of any other site: <source lang="bash"> access_$ ssh frontend.SITE2 </source> *There is no access to Internet from computing nodes (external IPs should be registered on proxy), therefore, download/update your stuff at the access nodes. Several revision control clients are available. *Each site has a separate NFS, therefore, to run an application on several sites at once, you need to copy it '''scp, sftp, rsync''' between access or frontend nodes. *Jobs are run from the frondend nodes, using a [http://en.wikipedia.org/wiki/OpenPBS PBS]-like system [https://www.grid5000.fr/mediawiki/index.php/Cluster_experiment-OAR2 OAR]. Basic commands: **'''oarstat''' - queue status **'''oarsub''' - job submission **'''oardel''' - job removal Interactive job on deployed images: <source lang="bash"> fontend_$ oarsub -I -t deploy -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> Batch job on installed images: <source lang="bash"> fontend_$ oarsub BATCH_FILE -t allow_classic_ssh -l [/cluster=N/]nodes=N,walltime=HH[:MM[:SS]] [-p 'PROPERTY="VALUE"'] </source> Specifying cluster name to reserve: <source lang="bash"> oarsub -r 'YYYY-MM-dd HH:mm:ss' -l nodes=2,walltime=1 -p "cluster='Genepi'" </source> If the resources are available two nodes from the cluster "Genepi" will be reserved for the specified time. 
*The image to deploy can be created and loaded with help of a [http://wiki.systemimager.org/index.php/Main_Page Systemimager]-like system [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2 Kadeploy]. Creating: [https://www.grid5000.fr/mediawiki/index.php/Deploy_environment-OAR2#Tune_an_environment_to_build_another_one:_customize_authentification_parameters described here] Loading: <source lang="bash"> fontend_$ kadeploy3 -a PATH_TO_PRIVATE_IMAGE_DESC -f $OAR_FILE_NODES </source> A Linux distribution lenny-x64-nfs-2.1 with mc, subversion, autotools, doxygen, MPICH2, GSL, Boost, R, gnuplot, graphviz, X11, evince is available at Orsay /home/nancy/alastovetsky/grid5000. == Compiling and running MPI applications == *Compilation should be done on one of the reserved nodes (e.g. ssh `head -n 1 $OAR_NODEFILE`) *Running MPI applications is described [https://www.grid5000.fr/mediawiki/index.php/Run_MPI_On_Grid%275000 here] **mpirun/mpiexec should be run from one of the reserved nodes (e.g. 
ssh `head -n 1 $OAR_NODEFILE`) == Setting up new deploy image == List available images kaenv3 -l Then book node and launch: oarsub -I -t deploy -l nodes=1,walltime=12 kadeploy3 -e squeeze-x64-big -f $OAR_FILE_NODES -k ssh root@`head -n 1 $OAR_NODEFILE` default password: grid5000 edit /etc/apt/sources.list apt-get update apt-get upgrade apt-get install libtool autoconf automake mc colorgcc ctags libboost-serialization-dev libboost-graph-dev libatlas-base-dev gfortran vim gdb valgrind screen subversion iperf bc gsl-bin libgsl0-dev Possibly also install (for using extrae): apt-get install libxml2-dev binutils-dev libunwind7-dev <br> Compiled for sources by us: *<strike>gsl-1.14 (download: ftp://ftp.gnu.org/gnu/gsl/)&nbsp;</strike> ''Now with squeeze it is in repository.'' <strike>./configure &amp;&amp; make &amp;&amp; make install</strike> *mpich2 (download: http://www.mcs.anl.gov/research/projects/mpich2/downloads/index.php?s=downloads) ./configure --enable-shared --enable-sharedlibs=gcc --with-pm=mpd make &amp;&amp; make install Mpich2 installed to: Installing MPE2 include files to /usr/local/include Installing MPE2 libraries to /usr/local/lib Installing MPE2 utility programs to /usr/local/bin Installing MPE2 configuration files to /usr/local/etc Installing MPE2 system utility programs to /usr/local/sbin Installing MPE2 man to /usr/local/share/man Installing MPE2 html to /usr/local/share/doc/ Installed MPE2 in /usr/local *hwloc (and lstopo) (download: http://www.open-mpi.org/software/hwloc/v1.2/) compile from sources. To get xml support install libxml2-dev and pkg-config apt-get install libxml2-dev pkg-config tar -xzvf hwloc-1.1.1.tar.gz cd hwloc-1.1.1 ./configure &amp;&amp; make &amp;&amp; make install Change root password. rm sources from root dir. 
Edit the "message of the day" vi /etc/motd.tail echo 90 &gt; /proc/sys/vm/overcommit_ratio echo 2 &gt; /proc/sys/vm/overcommit_memory date &gt;&gt; release Cleanup apt-get clean rm /etc/udev/rules.d/*-persistent-net.rules Make image ssh root@'''node''' tgz-g5k &gt; $HOME/grid5000/'''imagename'''.tgz make an appropriate .env file. kaenv3 -p lenny-x64-nfs -u deploy &gt; lenny-x64-custom-2.3.env <br> == GotoBLAS2 == http://www.tacc.utexas.edu/tacc-projects/gotoblas2 When compiling GotoBLAS on a node without direct internet access you get this error: <source lang="">wget http://www.netlib.org/lapack/lapack-3.1.1.tgz --2011-05-19 03:11:03-- http://www.netlib.org/lapack/lapack-3.1.1.tgz Resolving www.netlib.org... 160.36.58.108 Connecting to www.netlib.org|160.36.58.108|:80... failed: Connection timed out. Retrying. --2011-05-19 03:14:13-- (try: 2) http://www.netlib.org/lapack/lapack-3.1.1.tgz Connecting to www.netlib.org|160.36.58.108|:80... failed: Connection timed out. Retrying. ...</source> Fix this by downloading http://www.netlib.org/lapack/lapack-3.1.1.tgz to the GotoBLAS2 source directory and editing this line in the Makefile 184c184 &lt; -wget http://www.netlib.org/lapack/lapack-3.1.1.tgz --- &gt; # -wget http://www.netlib.org/lapack/lapack-3.1.1.tgz <br> GotoBLAS needs to be compiled individually for each unique machine, i.e. each cluster. Add the following to .bashrc export CLUSTER=`hostname |sed 's/\([a-z]*\).*/\1/'` LD_LIBRARY_PATH=$HOME/lib/$CLUSTER:$HOME/lib:/usr/local/lib:$LD_LIBRARY_PATH export LIBRARY_PATH=$HOME/lib/$CLUSTER:$HOME/lib:/usr/local/lib:$LIBRARY_PATH Run the following script once on each cluster: <source lang="bash">#! /bin/bash echo "Compiling gotoblas for cluster: $CLUSTER" cd $HOME/src if [ ! -d "$CLUSTER" ]; then mkdir $CLUSTER fi cd $CLUSTER tar -xzf ../Goto*.tar.gz cd Goto* make &> m.log if [ !
-d "$HOME/lib/$CLUSTER" ]; then mkdir $HOME/lib/$CLUSTER fi cp libgoto2.so $HOME/lib/$CLUSTER echo results ls -d $HOME/src/$CLUSTER ls $HOME/src/$CLUSTER ls -d $HOME/lib/$CLUSTER ls $HOME/lib/$CLUSTER</source> Note: for newer processors this may fail. If it is a NEHALEM processor, try: make clean make TARGET=NEHALEM == Paging and the OOM-Killer == When doing exhaustion-of-available-memory experiments, problems can occur with over-commit. See [[HCL cluster#Paging_and_the_OOM-Killer]] for more detail. == Example of experiment setup across several sites == The sources of all files mentioned below are available at: [[Grid5000:sources]]. Pick one head node as the main head node (I use grenoble, but any will do). Set up the sources cd dave/fupermod-1.1.0 make clean ./configure --with-cblas=goto --prefix=/usr/local/ Reserve 2 nodes from all clusters on a 3-cluster site: oarsub -r "2011-07-25 11:01:01" -t deploy -l cluster=3/nodes=2,walltime=11:59:00 Automate with: for a in 2 3 4; do for i in `cat sites.$a`; do echo $a $i; ssh $i oarsub -r "2011-07-25 11:01:01" -t deploy -l cluster=$a/nodes=2,walltime=11:59:00; done; done Then on each site, where xxx is the site name: kadeploy3 -a $HOME/grid5000/lenny-dave.env -f $OAR_NODE_FILE --output-ok-nodes deployed.xxx Gather the deployed files to a head node: for i in `cat ~/sites `; do echo $i; scp $i:deployed* .&nbsp;; done cat deployed.* &gt; deployed.all Copy cluster-specific libs to each deployed node's /usr/local/lib dir with the script copy_local_libs.sh deployed.all Copy the source files to the root dir of each deployed node.
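copy_local_libs.sh itself is not listed on this page; a hypothetical sketch of what it does, reusing the cluster-name sed expression from the GotoBLAS section above (the scp call is commented out so the sketch can be dry-run; the node names in the demo are made up):

```shell
#!/bin/sh
# Hypothetical sketch of copy_local_libs.sh: push the GotoBLAS library
# built for each node's cluster to that node's /usr/local/lib.
copy_local_libs() {
    nodefile=$1
    while read -r node; do
        # cluster name = leading alphabetic part of the hostname
        cluster=$(echo "$node" | sed 's/\([a-z]*\).*/\1/')
        echo "would copy \$HOME/lib/$cluster/libgoto2.so to $node"
        # scp "$HOME/lib/$cluster/libgoto2.so" "root@$node:/usr/local/lib/"
    done < "$nodefile"
}

# dry run with two made-up node names
printf 'grelon-12.nancy.grid5000.fr\nparamount-3.rennes.grid5000.fr\n' > /tmp/deployed.demo
copy_local_libs /tmp/deployed.demo
```

Running it prints one "would copy" line per node, with the library path keyed on the cluster prefix of the hostname.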
Then make install on each node (ssh -f does this in parallel) for i in `cat ~/deployed.all`; do echo $i; rsync -aP ~/dave/fupermod-1.1.0 root@$i:&nbsp;; done for i in `cat ~/deployed.all`; do echo $i; ssh -f root@$i "cd fupermod-1.1.0&nbsp;; make all install"&nbsp;; done ssh to the first node ssh `head -n1 deployed.all` n=$(cat deployed.all |wc -l) mpdboot --totalnum=$n --file=$HOME/deployed.all mpdtrace cd dave/data/ mpirun -n $n /usr/local/bin/partitioner -l /usr/local/lib/libmxm_col.so -a0 -D10000 -o N=100 Cleanup after: for i in `cat ~/sites `; do echo $i; ssh $i rm deployed.*&nbsp;; done == Check network speed == apt-get install iperf == Choose which network interface to use == mpirun --mca btl self,openib ... or mpirun --mca btl self,tcp ... == Installing Gadget-2.0.7 == # apt-get install hdf5-openmpi-dev sfftw-dev $ tar -xzvf gadget2.tar.gz $ cd Gadget-2.0.7/Gadget2 $ make CFLAGS="-DH5_USE_16_API" $ make clean; make == Installing Wrekavoc == Download from http://wrekavoc.gforge.inria.fr/ # apt-get install libxml2-dev pkg-config # tar -xzvf wrekavoc-1.1.tar.gz # cd wrekavoc-1.1/ # ./configure # make # ./src/burn 50 == Installing Extrae == (on grid5000 wheezy big) First install [http://www.dyninst.org/ Dyninst] # apt-get install libelf-dev libdwarf-dev # tar -xzvf DyninstAPI-8.1.2.tgz # cd DyninstAPI-8.1.2 # ./configure --with-libdwarf-static # make # make install Then Extrae # apt-get install # ./configure --with-mpi=/usr --with-mpi-libs=/usr/lib --with-papi=/usr/local --with-unwind=/usr --with-dyninst=/usr/local --with-dwarf=/usr ce1e48ce4c4494317f684096589c0c741651648c HCL cluster 0 5 809 713 2013-09-09T13:28:22Z Davepc 2 wikitext text/x-wiki == General Information == [[Image:Cluster.jpg|right|thumbnail||HCL Cluster]] [[Image:network.jpg|right|thumbnail||Layout of the Cluster]] The HCL cluster is heterogeneous in computing hardware and network capability.
Nodes are from Dell, IBM, and HP, with Celeron, Pentium 4, Xeon, and AMD processors ranging in speed from 1.8 to 3.6 GHz. Accordingly, architectures and parameters such as Front Side Bus, Cache, and Main Memory all vary. The operating system used is Debian “squeeze” with Linux kernel 2.6.32. The network hardware consists of two Cisco 24+4 port Gigabit switches. Each node has two Gigabit Ethernet ports - each eth0 is connected to the first switch, and each eth1 is connected to the second switch. The switches are also connected to each other. The bandwidth of each port can be configured to any value between 8Kb/s and 1Gb/s. This allows testing on a very large number of network topologies. As the bandwidth on the link connecting the two switches can also be configured, the cluster can actually act as two separate clusters connected via one link. The diagram shows a schematic of the cluster. === Detailed Cluster Specification === * [[HCL Cluster Specifications]] * [[Old HCL Cluster Specifications]] (pre May 2010) === Documentation === * [[media:PE750.tgz|Dell Poweredge 750 Documentation]] * [[media:SC1425.tgz|Dell Poweredge SC1425 Documentation]] * [[media:X306.pdf|IBM x-Series 306 Documentation]] * [[media:E326.pdf|IBM e-Series 326 Documentation ]] * [[media:Proliant100SeriesGuide.pdf|HP Proliant DL-140 G2 Documentation]] * [[media:ProliantDL320G3Guide.pdf|HP Proliant DL-320 G3 Documentation]] * [[media:Cisco3560Specs.pdf|Cisco Catalyst 3560 Specifications]] * [[media:Cisco3560Guide.pdf|Cisco Catalyst 3560 User Guide]] * [[HCL Cluster Network]] == Cluster Administration == If PBS jobs do not start after a reboot of heterogeneous.ucd.ie, it may be necessary to manually start maui: /usr/local/maui/sbin/maui ===Useful Tools=== <code>root</code> on <code>heterogeneous.ucd.ie</code> has a number of [http://expect.nist.gov/ Expect] scripts to automate administration on the cluster (in <code>/root/scripts</code>).
<code>root_ssh</code> will automatically log into a host, provide the root password and either return a shell to the user or execute a command that is passed as a second argument. Command syntax is as follows: <source lang="text"> # root_ssh usage: root_ssh [user@]<host> [command] </source> Example usage, to log in and execute a command on each node in the cluster (note the file <code>/etc/dsh/machines.list</code> contains the hostnames of all compute nodes of the cluster): # for i in `seq -w 1 16`; do root_ssh hcl$i ps ax \| grep pbs; done The above is sequential. To run parallel jobs, for example: <code>apt-get update && apt-get -y upgrade</code>, try the following trick with [http://www.gnu.org/software/screen/ screen]: # for i in `seq -w 1 16`; do screen -L -d -m root_ssh hcl$i apt-get update \&\& apt-get -y upgrade; done You can check the screenlog.* files for errors and delete them when you are happy. Sometimes all logs are sent to screenlog.0; it is not clear why. == Software packages available on HCL Cluster 2.0 == With a fresh installation of the operating system on the HCL Cluster, the following list of packages is available: * autoconf * automake * gcc * ctags * cg-vg * fftw2 * git * gfortran * gnuplot * libtool * netperf * octave3.2 * qhull * subversion * valgrind * gsl-dev * vim * python * mc * openmpi-bin * openmpi-dev * evince * libboost-graph-dev * libboost-serialization-dev * libatlas-base-dev * r-cran-strucchange * graphviz * doxygen * colorgcc [[HCL_cluster/hcl_node_install_configuration_log|new hcl node install & configuration log]] [[HCL_cluster/heterogeneous.ucd.ie_install_log|new heterogeneous.ucd.ie install log]] ===APT=== To do unattended updates on cluster machines, you need to specify some environment variables and switches to apt-get: export DEBIAN_FRONTEND=noninteractive apt-get -q -y upgrade NOTE: on hcl01 and hcl02 any updates to grub will force a prompt, despite the switches above.
This happens because there are two disks on these machines and grub asks which it should install itself on. == Access and Security == All access and security for the cluster is handled by the gateway machine (heterogeneous.ucd.ie). This machine is not considered a compute node and should not be used as such. The only new incoming connections allowed are ssh; other incoming packets, such as http responses to requests from inside the cluster (established or related), are also allowed. Incoming ssh packets are only accepted if they originate from designated IP addresses. These IPs must be registered UCD IPs. csserver.ucd.ie is allowed, as is hclgate.ucd.ie, on which all users have accounts. Thus, to gain access to the cluster you can ssh from csserver, hclgate or other allowed machines to heterogeneous. From there you can ssh to any of the nodes (hcl01-hcl16) that you are running a PBS job on. Access from outside the UCD network is only allowed once you have gained entry to a server that allows outside connections (such as csserver.ucd.ie). === Creating new user accounts === As root on heterogeneous run: adduser <username> make -C /var/yp === Access to the nodes is controlled by Torque PBS === Use qsub to submit a job; -I requests an interactive session, and walltime is the time required.
qsub -I -l walltime=1:00 \\ Reserve 1 node for 1 hour qsub -l nodes=hcl01+hcl07,walltime=1:00 myscript.sh Example Script:
#!/bin/sh
#General Script
#
#
#These commands set up the Grid Environment for your job:
#PBS -N JOBNAME
#PBS -l walltime=48:00:00
#PBS -l nodes=16
#PBS -m abe
#PBS -k eo
#PBS -V
echo foo
To see the queue qstat -n showq To remove your job qdel JOBNUM More info: [http://www.clusterresources.com/products/torque/docs/] == Some networking issues on HCL cluster (unsolved) == "/sbin/route" should give:
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref Use Iface
239.2.11.72     *               255.255.255.255 UH    0      0   0   eth0
heterogeneous.u *               255.255.255.255 UH    0      0   0   eth0
192.168.21.0    *               255.255.255.0   U     0      0   0   eth1
192.168.20.0    *               255.255.255.0   U     0      0   0   eth0
192.168.20.0    *               255.255.254.0   U     0      0   0   eth0
192.168.20.0    *               255.255.254.0   U     0      0   0   eth1
default         heterogeneous.u 0.0.0.0         UG    0      0   0   eth0
For reasons unclear, sometimes many machines miss the entry:
192.168.21.0    *               255.255.255.0   U     0      0   0   eth1
For Open MPI, this leads to an inability to complete a system sockets "connect" call to any 192.*.21.* address (hangup). For this case, you can * switch off eth1 (see also [http://hcl.ucd.ie/wiki/index.php/OpenMPI] ): mpirun --mca btl_tcp_if_exclude lo,eth1 ... or * you can restore the above table on all nodes by running "sh /etc/network/if-up.d/00routes" as root It is not yet clear why, without this entry, connections to the "21" addresses cannot be established. We expect that in this case the following rule should be matched (because of the mask):
192.168.20.0    *               255.255.254.0   U     0      0   0   eth0
The packets then leave over the eth0 network interface and should go over switch1 to switch2 and the eth1 interface of the corresponding node. * If one attempts a ping from one node A, via its eth0 interface, to the address of another node's (B) eth1 interface, the following is observed: ** outgoing ping packets appear only on the eth0 interface of the first node A.
** incoming ping packets appear only on the eth1 interface of the second node B. ** outgoing ping response packets appear on the eth0 interface of the second node B, never on the eth1 interface, despite pinging the eth1 address specifically. What explains this? With the routing tables as they are above, or in the damaged case, the ping may arrive at the correct interface, but the response from B is routed to A-eth0 via B-eth0. Further, after a number of ping packets have been sent in sequence (50 to 100), pings from A, though the -i eth0 switch is specified, begin to appear on both A-eth0 and A-eth1. This behaviour is unexpected, but does not affect the return path of the ping response packet. In order to get a symmetric behaviour, where a packet leaves A-eth0, travels via the switch bridge to B-eth1 and returns back from B-eth1 to A-eth0, one must ensure the routing table of B contains no eth0 entries. == Paging and the OOM-Killer == Please read [[virtual memory overcommit]] for details. Due to the nature of experiments run on the cluster, we often induce heavy paging and complete exhaustion of available memory on certain nodes. Linux has a pair of strategies to deal with heavy memory use. The first is overcommitting. This is where a process is allowed to allocate or fork even when there is no more memory available. You can see some interesting numbers here: [http://opsmonkey.blogspot.com/2007/01/linux-memory-overcommit.html]. The assumption is that processes may not use all the memory that they allocate, and failing on allocation is worse than failing at a later date when the memory is actually required. More processes may be supported by allowing them to allocate memory (provided they do not use it all). The second part of the strategy is the Out-of-Memory killer (OOM Killer). When memory has been exhausted and a process tries to use some 'overcommitted' part of memory, the OOM killer is invoked.
Its job is to rank all processes in terms of their memory use, priority, privilege and some other parameters, and then select a process to kill based on the ranks. The argument for using overcommit+OOM Killer is that rather than failing to allocate memory for some random unlucky process, which as a result would probably terminate, the kernel can instead allow the unlucky process to continue executing and then make a somewhat informed decision on which process to kill. Unfortunately, the behaviour of the OOM-killer sometimes causes problems which grind the machine to a complete halt, particularly when it decides to kill system processes. There is a good discussion on the OOM-killer here: [http://lwn.net/Articles/104179/] For this reason overcommit has been disabled on the cluster. cat /proc/sys/vm/overcommit_memory 2 cat /proc/sys/vm/overcommit_ratio 100 To restore the default overcommit # echo 0 > /proc/sys/vm/overcommit_memory # echo 50 > /proc/sys/vm/overcommit_ratio == Manually Limit the Memory on the OS level == As root, edit /etc/default/grub GRUB_CMDLINE_LINUX_DEFAULT="quiet mem=128M" then run the command update-grub reboot 008de2fcc9efac0a58a7da4648d0587f81b56bca 813 809 2013-09-09T13:53:24Z Davepc 2 wikitext text/x-wiki == General Information == [[Image:Cluster.jpg|right|thumbnail||HCL Cluster]] [[Image:network.jpg|right|thumbnail||Layout of the Cluster]] The HCL cluster is heterogeneous in computing hardware and network capability. Nodes are from Dell, IBM, and HP, with Celeron, Pentium 4, Xeon, and AMD processors ranging in speed from 1.8 to 3.6 GHz. Accordingly, architectures and parameters such as Front Side Bus, Cache, and Main Memory all vary. The operating system used is Debian “squeeze” with Linux kernel 2.6.32. The network hardware consists of two Cisco 24+4 port Gigabit switches. Each node has two Gigabit Ethernet ports - each eth0 is connected to the first switch, and each eth1 is connected to the second switch.
The switches are also connected to each other. The bandwidth of each port can be configured to meet any value between 8Kb/s and 1Gb/s. This allows testing on a very large number of network topologies, As the bandwidth on the link connecting the two switches can also be configured, the cluster can actually act as two separate clusters connected via one link. The diagram shows a schematic of the cluster. === Detailed Cluster Specification === * [[HCL Cluster Specifications]] * [[Old HCL Cluster Specifications]] (pre May 2010) === Documentation === * [[media:PE750.tgz|Dell Poweredge 750 Documentation]] * [[media:SC1425.tgz|Dell Poweredge SC1425 Documentation]] * [[media:X306.pdf|IBM x-Series 306 Documentation]] * [[media:E326.pdf|IBM e-Series 326 Documentation ]] * [[media:Proliant100SeriesGuide.pdf|HP Proliant DL-140 G2 Documentation]] * [[media:ProliantDL320G3Guide.pdf|HP Proliant DL-320 G3 Documentation]] * [[media:Cisco3560Specs.pdf|Cisco Catalyst 3560 Specifications]] * [[media:Cisco3560Guide.pdf|Cisco Catalyst 3560 User Guide]] * [[HCL Cluster Network]] == Cluster Administration == If PBS jobs do not start after a reboot of heterogeneous.ucd.ie it may be necessary to manually start maui: /usr/local/maui/sbin/maui ===Useful Tools=== <code>root</code> on <code>heterogeneous.ucd.ie</code> has a number of [http://expect.nist.gov/ Expect] scripts to automate administration on the cluster (in <code>/root/scripts</code>). <code>root_ssh</code> will automatically log into a host, provide the root password and either return a shell to the user or execute a command that is passed as a second argument. 
Command syntax is as follows: <source lang="text"> # root_ssh usage: root_ssh [user@]<host> [command] </source> Example usage, to log in and execute a command on each node in the cluster (note the file <code>/etc/dsh/machines.list</code> contains the hostnames of all compute nodes of the cluster): # for i in `seq -w 1 16`; do root_ssh hcl$i ps ax \| grep pbs; done The above is sequential. To run parallel jobs, for example: <code>apt-get update && apt-get -y upgrade</code>, try the following trick with [http://www.gnu.org/software/screen/ screen]: # for i in `seq -w 1 16`; do screen -L -d -m root_ssh hcl$i apt-get update \&\& apt-get -y upgrade; done You can check the screenlog.* files for errors and delete them when you are happy. Sometimes all logs are sent to screenlog.0; it is not clear why. == Software packages available on HCL Cluster 2.0 == With a fresh installation of the operating system on the HCL Cluster, the following list of packages is available: * autoconf * automake * gcc * ctags * cg-vg * fftw2 * git * gfortran * gnuplot * libtool * netperf * octave3.2 * qhull * subversion * valgrind * gsl-dev * vim * python * mc * openmpi-bin * openmpi-dev * evince * libboost-graph-dev * libboost-serialization-dev * libatlas-base-dev * r-cran-strucchange * graphviz * doxygen * colorgcc [[HCL_cluster/hcl_node_install_configuration_log|new hcl node install & configuration log]] [[HCL_cluster/heterogeneous.ucd.ie_install_log|new heterogeneous.ucd.ie install log]] ===APT=== To do unattended updates on cluster machines, you need to specify some environment variables and switches to apt-get: export DEBIAN_FRONTEND=noninteractive apt-get -q -y upgrade NOTE: on hcl01 and hcl02 any updates to grub will force a prompt, despite the switches above. This happens because there are two disks on these machines and grub asks which it should install itself on. == Access and Security == All access and security for the cluster is handled by the gateway machine (heterogeneous.ucd.ie).
This machine is not considered a compute node and should not be used as such. The only new incoming connections allowed are ssh, other incoming packets such as http that are responding to requests from inside the cluster (established or related) are also allowed. Incoming ssh packets are only accepted if they are originating from designated IP addresses. These IP's must be registered ucd IP's. csserver.ucd.ie is allowed, as is hclgate.ucd.ie, on which all users have accounts. Thus to gain access to the cluster you can ssh from csserver, hclgate or other allowed machines to heterogeneous. From there you can ssh to any of the nodes (hcl01-hcl16) that you are running a pbs job on. Access from outside the UCD network is only allowed once you have gained entry to a server that allows outside connections (such as csserver.ucd.ie) === Creating new user accounts === As root on heterogeneous run: adduser <username> make -C /var/yp === Access to the nodes is controlled by Torque PBS.=== Use qsub to submit a job, -I is for an interactive session, walltime is time required. 
qsub -I -l walltime=1:00 \\ Reserve 1 node for 1 hour qsub -l nodes=hcl01+hcl07,walltime=1:00 myscript.sh Example Script:
#!/bin/sh
#General Script
#
#
#These commands set up the Grid Environment for your job:
#PBS -N JOBNAME
#PBS -l walltime=48:00:00
#PBS -l nodes=16
#PBS -m abe
#PBS -k eo
#PBS -V
echo foo
To see the queue qstat -n showq To remove your job qdel JOBNUM More info: [http://www.clusterresources.com/products/torque/docs/] == Some networking issues on HCL cluster (unsolved) == "/sbin/route" should give:
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref Use Iface
239.2.11.72     *               255.255.255.255 UH    0      0   0   eth0
heterogeneous.u *               255.255.255.255 UH    0      0   0   eth0
192.168.21.0    *               255.255.255.0   U     0      0   0   eth1
192.168.20.0    *               255.255.255.0   U     0      0   0   eth0
192.168.20.0    *               255.255.254.0   U     0      0   0   eth0
192.168.20.0    *               255.255.254.0   U     0      0   0   eth1
default         heterogeneous.u 0.0.0.0         UG    0      0   0   eth0
For reasons unclear, sometimes many machines miss the entry:
192.168.21.0    *               255.255.255.0   U     0      0   0   eth1
For Open MPI, this leads to an inability to complete a system sockets "connect" call to any 192.*.21.* address (hangup). For this case, you can * switch off eth1 (see also [http://hcl.ucd.ie/wiki/index.php/OpenMPI] ): mpirun --mca btl_tcp_if_exclude lo,eth1 ... or * you can restore the above table on all nodes by running "sh /etc/network/if-up.d/00routes" as root It is not yet clear why, without this entry, connections to the "21" addresses cannot be established. We expect that in this case the following rule should be matched (because of the mask):
192.168.20.0    *               255.255.254.0   U     0      0   0   eth0
The packets then leave over the eth0 network interface and should go over switch1 to switch2 and the eth1 interface of the corresponding node. * If one attempts a ping from one node A, via its eth0 interface, to the address of another node's (B) eth1 interface, the following is observed: ** outgoing ping packets appear only on the eth0 interface of the first node A.
** incoming ping packets appear only on the eth1 interface of the second node B. ** outgoing ping response packets appear on the eth0 interface of the second node B, never on the eth1 interface, despite pinging the eth1 address specifically. What explains this? With the routing tables as they are above, or in the damaged case, the ping may arrive at the correct interface, but the response from B is routed to A-eth0 via B-eth0. Further, after a number of ping packets have been sent in sequence (50 to 100), pings from A, though the -i eth0 switch is specified, begin to appear on both A-eth0 and A-eth1. This behaviour is unexpected, but does not affect the return path of the ping response packet. In order to get a symmetric behaviour, where a packet leaves A-eth0, travels via the switch bridge to B-eth1 and returns back from B-eth1 to A-eth0, one must ensure the routing table of B contains no eth0 entries. == Paging and the OOM-Killer == Please read the [[Virtual Memory Overcommit]] page for details. For the reasons given, overcommit has been disabled on the cluster. cat /proc/sys/vm/overcommit_memory 2 cat /proc/sys/vm/overcommit_ratio 100 To restore the default overcommit # echo 0 > /proc/sys/vm/overcommit_memory # echo 50 > /proc/sys/vm/overcommit_ratio == Manually Limit the Memory on the OS level == Memory can be manually limited in grub. This should be reset when you are finished. If you are doing memory-exhaustive experiments, check that this has not been adjusted by someone else. See [[Memory size, overcommit, limit]] for more detail. 6dd96ff4d2718805d1c86643d0bbb8f71130e9ec Memory size, overcommit, limit 0 96 810 2013-09-09T13:29:46Z Davepc 2 Created page with "== Paging and the OOM-Killer == Due to the nature of experiments our group runs, we often induce heavy paging and complete exhaustion of available memory on certain nodes.
Linux …" wikitext text/x-wiki == Paging and the OOM-Killer == Due to the nature of experiments our group runs, we often induce heavy paging and complete exhaustion of available memory on certain nodes. Linux has a pair of strategies to deal with heavy memory use. The first is overcommitting. This is where a process is allowed to allocate or fork even when there is no more memory available. You can see some interesting numbers here: [http://opsmonkey.blogspot.com/2007/01/linux-memory-overcommit.html]. The assumption is that processes may not use all the memory that they allocate, and failing on allocation is worse than failing at a later date when the memory is actually required. More processes may be supported by allowing them to allocate memory (provided they do not use it all). The second part of the strategy is the Out-of-Memory killer (OOM Killer). When memory has been exhausted and a process tries to use some 'overcommitted' part of memory, the OOM killer is invoked. Its job is to rank all processes in terms of their memory use, priority, privilege and some other parameters, and then select a process to kill based on the ranks. The argument for using overcommit+OOM Killer is that rather than failing to allocate memory for some random unlucky process, which as a result would probably terminate, the kernel can instead allow the unlucky process to continue executing and then make a somewhat informed decision on which process to kill. Unfortunately, the behaviour of the OOM-killer sometimes causes problems which grind the machine to a complete halt, particularly when it decides to kill system processes. There is a good discussion on the OOM-killer here: [http://lwn.net/Articles/104179/] For this reason overcommit has been disabled on the HCL cluster.
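With overcommit_memory set to 2, the kernel refuses allocations beyond a hard commit limit of swap plus overcommit_ratio percent of RAM. A quick sanity check of that arithmetic (the 1 GiB swap and RAM figures below are illustrative, not the actual sizes of any cluster node):

```shell
#!/bin/sh
# CommitLimit = SwapTotal + MemTotal * overcommit_ratio / 100
# (compare against the CommitLimit line in /proc/meminfo on a node)
swap_kb=1048576   # illustrative: 1 GiB swap
mem_kb=1048576    # illustrative: 1 GiB RAM
ratio=100         # overcommit_ratio as configured on the cluster
limit_kb=$(( swap_kb + mem_kb * ratio / 100 ))
echo "CommitLimit = ${limit_kb} kB"
```

With ratio=100 the limit is simply RAM plus swap; with the kernel default of 50, only half of RAM counts towards it.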
cat /proc/sys/vm/overcommit_memory 2 cat /proc/sys/vm/overcommit_ratio 100 To restore to default overcommit # echo 0 > /proc/sys/vm/overcommit_memory # echo 50 > /proc/sys/vm/overcommit_ratio == Manually Limit the Memory on the OS level == as root edit /etc/default/grub GRUB_CMDLINE_LINUX_DEFAULT="quiet mem=128M" then run the command update-grub reboot d3bb8ae165a97fd44207cdf95d7457a4a17b8278 811 810 2013-09-09T13:50:10Z Davepc 2 moved [[Virtual memory overcommit]] to [[Memory size, overcommit, limit]] wikitext text/x-wiki == Paging and the OOM-Killer == Due to the nature of experiments our group runs, we often induce heavy paging and complete exhaustion of available memory on certain nodes. Linux has a pair of strategies to deal with heavy memory use. First, is overcommitting. This is where a process is allowed allocate or fork even when there is no more memory available. You can seem some interesting numbers here:[http://opsmonkey.blogspot.com/2007/01/linux-memory-overcommit.html]. The assumption is that processes may not use all memory that they allocate and failing on allocation is worse than failing at a later date when the memory use is actually required. More processes may be supported by allowing them to allocate memory (provided they do not use it all). The second part of the strategy is the Out-of-Memory killer (OOM Killer). When memory has been exhausted and a process tries to use some 'overcommitted' part of memory, the OOM killer is invoked. It's job is to rank all processes in terms of their memory use, priority, privilege and some other parameters, and then select a process to kill based on the ranks. The argument for using overcommital+OOM Killer is that rather than failing to allocate memory for some random unlucky process, which as a result would probably terminate, the kernel can instead allow the unlucky process to continue executing and then make a some-what-informed decision on which process to kill. 
Unfortunately, the behaviour of the OOM-killer sometimes causes problems which grind the machine to a complete halt, particularly when it decides to kill system processes. There is a good discussion on the OOM-killer here: [http://lwn.net/Articles/104179/] For this reason overcommit has been disabled on the HCL cluster. cat /proc/sys/vm/overcommit_memory 2 cat /proc/sys/vm/overcommit_ratio 100 To restore to default overcommit # echo 0 > /proc/sys/vm/overcommit_memory # echo 50 > /proc/sys/vm/overcommit_ratio == Manually Limit the Memory on the OS level == as root edit /etc/default/grub GRUB_CMDLINE_LINUX_DEFAULT="quiet mem=128M" then run the command update-grub reboot d3bb8ae165a97fd44207cdf95d7457a4a17b8278 815 811 2013-09-09T13:58:53Z Davepc 2 wikitext text/x-wiki == Paging and the OOM-Killer == Due to the nature of experiments our group runs, we often induce heavy paging and complete exhaustion of available memory on certain nodes. Linux has a pair of strategies to deal with heavy memory use. First, is overcommitting. This is where a process is allowed allocate or fork even when there is no more memory available. You can seem some interesting numbers here:[http://opsmonkey.blogspot.com/2007/01/linux-memory-overcommit.html]. The assumption is that processes may not use all memory that they allocate and failing on allocation is worse than failing at a later date when the memory use is actually required. More processes may be supported by allowing them to allocate memory (provided they do not use it all). The second part of the strategy is the Out-of-Memory killer (OOM Killer). When memory has been exhausted and a process tries to use some 'overcommitted' part of memory, the OOM killer is invoked. It's job is to rank all processes in terms of their memory use, priority, privilege and some other parameters, and then select a process to kill based on the ranks. 
The argument for using overcommit+OOM Killer is that rather than failing to allocate memory for some random unlucky process, which as a result would probably terminate, the kernel can instead allow the unlucky process to continue executing and then make a somewhat informed decision on which process to kill. Unfortunately, the behaviour of the OOM-killer sometimes causes problems which grind the machine to a complete halt, particularly when it decides to kill system processes. There is a good discussion on the OOM-killer here: [http://lwn.net/Articles/104179/] For this reason overcommit has been disabled on the HCL cluster. cat /proc/sys/vm/overcommit_memory 2 cat /proc/sys/vm/overcommit_ratio 100 To restore the default overcommit # echo 0 &gt; /proc/sys/vm/overcommit_memory # echo 50 &gt; /proc/sys/vm/overcommit_ratio Run the following two programmes from here: [http://www.win.tue.nl/~aeb/linux/lk/lk-9.html www.win.tue.nl/~aeb/linux/lk/lk-9.html] to test the results. Demo program 1: allocate memory without using it. <source lang="">
#include <stdio.h>
#include <stdlib.h>

int main (void)
{
    int n = 0;
    while (1) {
        if (malloc(1<<20) == NULL) {
            printf("malloc failure after %d MiB\n", n);
            return 0;
        }
        printf("got %d MiB\n", ++n);
    }
}
</source><br> Demo program 2: allocate memory and actually touch it all.
<source lang="">
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int main (void)
{
    int n = 0;
    char *p;
    while (1) {
        if ((p = malloc(1<<20)) == NULL) {
            printf("malloc failure after %d MiB\n", n);
            return 0;
        }
        memset(p, 0, (1<<20));
        printf("got %d MiB\n", ++n);
    }
}
</source> == Manually Limit the Memory on the OS level == As root, edit /etc/default/grub GRUB_CMDLINE_LINUX_DEFAULT="quiet mem=128M" then run the command update-grub reboot b3259991799883d21ebf1194e0691d78a59fb870 816 815 2013-09-09T13:59:20Z Davepc 2 wikitext text/x-wiki == Paging and the OOM-Killer == Due to the nature of experiments our group runs, we often induce heavy paging and complete exhaustion of available memory on certain nodes. Linux has a pair of strategies to deal with heavy memory use. The first is overcommitting. This is where a process is allowed to allocate or fork even when there is no more memory available. You can see some interesting numbers here: [http://opsmonkey.blogspot.com/2007/01/linux-memory-overcommit.html]. The assumption is that processes may not use all the memory that they allocate, and failing on allocation is worse than failing at a later date when the memory is actually required. More processes may be supported by allowing them to allocate memory (provided they do not use it all). The second part of the strategy is the Out-of-Memory killer (OOM Killer). When memory has been exhausted and a process tries to use some 'overcommitted' part of memory, the OOM killer is invoked. Its job is to rank all processes in terms of their memory use, priority, privilege and some other parameters, and then select a process to kill based on the ranks. The argument for using overcommit+OOM Killer is that rather than failing to allocate memory for some random unlucky process, which as a result would probably terminate, the kernel can instead allow the unlucky process to continue executing and then make a somewhat informed decision on which process to kill.
Unfortunately, the behaviour of the OOM-killer sometimes causes problems which grind the machine to a complete halt, particularly when it decides to kill system processes. There is a good discussion of the OOM-killer here: [http://lwn.net/Articles/104179/] For this reason overcommit has been disabled on the HCL cluster. cat /proc/sys/vm/overcommit_memory 2 cat /proc/sys/vm/overcommit_ratio 100 To restore the default overcommit behaviour: # echo 0 &gt; /proc/sys/vm/overcommit_memory # echo 50 &gt; /proc/sys/vm/overcommit_ratio <br> Run the following two programs, taken from [http://www.win.tue.nl/~aeb/linux/lk/lk-9.html www.win.tue.nl/~aeb/linux/lk/lk-9.html], to test the results. Demo program 1: allocate memory without using it. <source lang="c">
#include <stdio.h>
#include <stdlib.h>

int main (void)
{
    int n = 0;
    while (1) {
        if (malloc(1 << 20) == NULL) {
            printf("malloc failure after %d MiB\n", n);
            return 0;
        }
        printf("got %d MiB\n", ++n);
    }
}
</source><br> Demo program 2: allocate memory and actually touch it all. <source lang="c">
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int main (void)
{
    int n = 0;
    char *p;
    while (1) {
        if ((p = malloc(1 << 20)) == NULL) {
            printf("malloc failure after %d MiB\n", n);
            return 0;
        }
        memset(p, 0, 1 << 20);
        printf("got %d MiB\n", ++n);
    }
}
</source> == Manually Limit the Memory on the OS level == As root, edit /etc/default/grub: GRUB_CMDLINE_LINUX_DEFAULT="quiet mem=128M" then run the command update-grub reboot 10655afed3f9ee151d1fc36269050cd78061fd7e Virtual memory overcommit 0 97 812 2013-09-09T13:50:10Z Davepc 2 moved [[Virtual memory overcommit]] to [[Memory size, overcommit, limit]] wikitext text/x-wiki #REDIRECT [[Memory size, overcommit, limit]] 496c7121d4c831d0e05a5f66bb08910045659eb4 HCL cluster/hcl node install configuration log 0 49 817 538 2013-09-26T16:43:18Z Davepc 2 /* Complications */ wikitext text/x-wiki HCL Nodes will be installed from a clone of a root node, <code>hcl07</code>.
The general installation of the root is documented here. There are a number of complications as a result of the cloning process; solutions to these complications are also explained. See also [[HCL_cluster/heterogeneous.ucd.ie_install_log]] =General Installation= Partition the filesystem with swap at the end of the disk, 1 GB in size, equal to the maximum memory installed on any cluster node. The root file system occupies the remainder of the disk, in EXT4 format. Install the long list of packages. ==Networking== Configure the network interfaces as follows: <source lang="text">
# This file describes the network interfaces available on your system
# and how to activate them. For more information, see interfaces(5).

# The loopback network interface
auto lo eth0 eth1
iface lo inet loopback

# The primary network interface
allow-hotplug eth0
iface eth0 inet dhcp

allow-hotplug eth1
iface eth1 inet dhcp
</source> ===Routing Tables=== Add an executable script named <code>00routes</code> to the directory <code>/etc/network/if-up.d</code>. This script will be called after the interfaces listed as <code>auto</code> are brought up on boot (or on a networking service restart). Note: the DHCP process for eth1 is backgrounded so that the startup of other services can continue. Our routing script adds routes on this interface even though it is not yet fully up; the script outputs some errors, but the routing entries remain in place nonetheless.
The script should read as follows: <source lang="bash">
#!/bin/sh
# Static Routes

# route ganglia broadcast
route add -host 239.2.11.72 dev eth0

# all traffic to heterogeneous gate goes through eth0
route add -host 192.168.20.254 dev eth0

# all subnet traffic goes through a specific interface
route add -net 192.168.20.0 netmask 255.255.255.0 dev eth0
route add -net 192.168.21.0 netmask 255.255.255.0 dev eth1
</source> The naming of the script is important: we want our routes in place before the other scripts in the <code>/etc/network/if-up.d</code> directory are executed, and the order in which they are executed is alphabetical. ===Hosts=== Change the hosts file so that it does not list the node's hostname; otherwise this would confuse nodes that are cloned from this image. <source lang="text">
127.0.0.1       localhost

# The following lines are desirable for IPv6 capable hosts
::1     localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
</source> ==Ganglia== Install the ganglia-monitor package. Configure the ganglia monitor by editing <code>/etc/ganglia/gmond.conf</code> so that it contains: <source lang="text">
cluster {
  name = "HCL Cluster"
  owner = "University College Dublin"
  latlong = "unspecified"
  url = "http://hcl.ucd.ie/"
}
</source> And ... <source lang="text">
/* Feel free to specify as many udp_send_channels as you like.
   Gmond used to only support having a single channel */
udp_send_channel {
  mcast_join = 239.2.11.72
  port = 8649
  ttl = 1
}

/* You can specify as many udp_recv_channels as you like as well. */
udp_recv_channel {
  mcast_join = 239.2.11.72
  port = 8649
  bind = 239.2.11.72
}
</source> After all packages are complete, execute: <source lang="text">
service ganglia-monitor restart
</source> ==NIS Client== Install the nis package.
Set <code>/etc/defaultdomain</code> to contain <code>heterogeneous.ucd.ie</code> Make sure the NIS Server has an entry in <code>/etc/hosts</code>; this is because DNS may not be active when the NIS client is starting, and we want to ensure that it connects to the server successfully. 192.168.21.254 heterogeneous.ucd.ie heterogeneous Make sure the file <code>/etc/nsswitch.conf</code> contains: passwd: compat group: compat shadow: compat Append to the file <code>/etc/passwd</code> the line <code>+::::::</code> Append to the file <code>/etc/group</code> the line <code>+:::</code> Append to the file <code>/etc/shadow</code> the line <code>+::::::::</code> Edit <code>/etc/yp.conf</code> (this wasn't needed before; it is now, with the RAID server): ypserver 192.168.20.254 Start the nis service: service nis start Check that NIS is operating correctly by running the following command: ypcat passwd ==NFS== apt-get install nfs-common portmap Add the following line to <code>/etc/fstab</code>: 192.168.20.254:/home /home nfs soft,retrans=6 0 0 Set in <code>/etc/default/nfs-common</code> (this wasn't needed before; it is now, with the RAID server): NEED_IDMAPD=yes Then: service nfs-common restart mount /home ==Torque PBS== First install PBS on the headnode, [[New_heterogeneous.ucd.ie_install_log#Packages_for_nodes|explained here]]. Then: ./torque-package-mom-linux-i686.sh --install ./torque-package-clients-linux-i686.sh --install update-rc.d pbs_mom defaults service pbs_mom start ==NTP== Install the NTP software: apt-get install ntp Edit the configuration and make <code>server heterogeneous.ucd.ie</code> the sole server entry. Comment out any other servers. Restart the NTP service. =Complications= ==Hostnames== Debian does not pull the hostname from the DHCP server. Without intervention, cloned nodes will keep the hostname stored on the image of the root node.
A bug covering the setting of the hostname via DHCP is described [https://bugs.launchpad.net/ubuntu/+source/dhcp3/+bug/90388 here]. The solution we will use is to add the file <code>/etc/dhcp3/dhclient-exit-hooks.d/hostname</code> with the contents: <source lang="bash">
if [[ -n $new_host_name ]]; then
    echo "$new_host_name" > /etc/hostname
    /bin/hostname $new_host_name
fi
</source> The effect of this is to set the hostname of the machine after an interface is configured using dhclient (the DHCP client). Note that the hostname of the machine will be set by the last interface configured via DHCP; in the current configuration that will be <code>eth0</code>. If an interface is reconfigured using dhclient, the hostname will be reset to the name belonging to that interface. Further, the current hostnames for the second interface (<code>eth1</code>) on the nodes are '''invalid'''. They follow the format hcl??_eth1.ucd.ie; however, the '_' character is not permitted in hostnames, and attempting to set such a hostname fails. ==udev and Network Interfaces== The udev system attempts to keep network interface names consistent regardless of changing hardware. This may be useful for laptops with wireless cards that are plugged in and out, but it causes problems when trying to install our root node image across all machines in the cluster. A description of the problem can be read [http://www.ducea.com/2008/09/01/remove-debian-udev-persistent-net-rules/ here]. The solution is to remove the udev rules for persistent network interfaces and disable the generator script for these rules.
On the root cloning node do the following #remove the file <code>/etc/udev/rules.d/70-persistent-net.rules</code> #and to the top of the file: <code>/lib/udev/rules.d/75-persistent-net-generator.rules</code>, the following lines: <source lang="text"># skip generation of persistent network interfaces ACTION=="*", GOTO="persistent_net_generator_end"</source> ==Sysstat== [http://pagesperso-orange.fr/sebastien.godard/ Sysstat] is a useful suite of tools for measuring performance of different system components. Unfortunately it adds some un-useful cron entries for collecting a historic set of system performance data. Though these cron entries point to disabled scripts, we will disable them none the less. Edit the file <code>/etc/cron.d/sysstat</code> and comment out all lines. <source lang="text"> # The first element of the path is a directory where the debian-sa1 # script is located #PATH=/usr/lib/sysstat:/usr/sbin:/usr/sbin:/usr/bin:/sbin:/bin # Activity reports every 10 minutes everyday #5-55/10 * * * * root command -v debian-sa1 > /dev/null && debian-sa1 1 1 # Additional run at 23:59 to rotate the statistics file #59 23 * * * root command -v debian-sa1 > /dev/null && debian-sa1 60 2 </source> =Upgrade September 2013= Upgrade motivated by needing a kernel newer than 2.6.35 apt-get update be39d73ecf833a216c7ccd09818ad87b9415d4ea 818 817 2013-09-26T17:50:34Z Davepc 2 wikitext text/x-wiki HCL Nodes will be installed from a clone of a root node, <code>hcl07</code>. The general installation of the root is documented here. There are a number of complications as a result of the cloning process. Solutions to these complications are also explained. See also [[HCL_cluster/heterogeneous.ucd.ie_install_log]] =General Installation= Partition filesystem with swap at the end of the disk, size 1GB, equal to maximum of the installed memory on cluster nodes. Root file system occupies the remainder of the disk, EXT4 format. Install long list of packages. 
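The swap-at-end partition layout described above can be sketched numerically. The 160 GB disk size below is a hypothetical example, not a measured node disk; only the 1 GB swap figure comes from the text:

```shell
# Swap-at-end layout: root (EXT4) fills the disk, swap takes the last 1 GB.
disk_mb=160000            # hypothetical disk size in MB
swap_mb=1024              # 1 GB swap, sized to the largest node memory
root_end_mb=$((disk_mb - swap_mb))
echo "root (EXT4): 1MB - ${root_end_mb}MB"
echo "swap:        ${root_end_mb}MB - ${disk_mb}MB"
```

Placing swap at the fixed end of the disk means the root partition can differ in size across nodes without moving the swap boundary arithmetic.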
==Networking== Configure network interface as follows: <source lang="text"># This file describes the network interfaces available on your system # and how to activate them. For more information, see interfaces(5). # The loopback network interface auto lo eth0 eth1 iface lo inet loopback # The primary network interface allow-hotplug eth0 iface eth0 inet dhcp allow-hotplug eth1 iface eth1 inet dhcp </source> ===Routing Tables=== Add an executable script named <code>00routes</code> to the directory <code>/etc/network/if-up.d</code>. This script will be called after the interfaces listed as <code>auto</code> are brought up on boot (or networking service restart). Note: the DHCP process for eth1 is backgrounded so that the startup of other services can continue. Our routing script adds routes on this interface even though it is not yet fully up. The script outputs some errors but the routing entries remain none-the-less. The script should read as follows: <source lang="bash"> #!/bin/sh # Static Routes # route ganglia broadcast t route add -host 239.2.11.72 dev eth0 # all traffic to heterogeneous gate goes through eth0 route add -host 192.168.20.254 dev eth0 # all subnet traffic goes through specific interface route add -net 192.168.20.0 netmask 255.255.255.0 dev eth0 route add -net 192.168.21.0 netmask 255.255.255.0 dev eth1 </source> The naming of the script is important, we want our routes in place before other scripts in the <code>/etc/network/if-up.d</code> directory are executed, the order in which they are executed is alphabetical. ===Hosts=== Change the hosts file so that it does not list the node's hostname, otherwise this would confuse nodes that are cloned from this image. <source lang="text"> 127.0.0.1 localhost # The following lines are desirable for IPv6 capable hosts ::1 localhost ip6-localhost ip6-loopback fe00::0 ip6-localnet ff00::0 ip6-mcastprefix ff02::1 ip6-allnodes ff02::2 ip6-allrouters </source> ==Ganglia== Install the ganglia-monitor package. 
Configure ganglia monitor by editing <code>/etc/ganglia/gmond.conf</code> so that it contains: <source lang="text">cluster { name = "HCL Cluster" owner = "University College Dublin" latlong = "unspecified" url = "http://hcl.ucd.ie/" } </source> And ... <source lang="text"> /* Feel free to specify as many udp_send_channels as you like. Gmond used to only support having a single channel */ udp_send_channel { mcast_join = 239.2.11.72 port = 8649 ttl = 1 } /* You can specify as many udp_recv_channels as you like as well. */ udp_recv_channel { mcast_join = 239.2.11.72 port = 8649 bind = 239.2.11.72 } </source> After all packages are complete execute: <source lang="text"> service ganglia-monitor restart </source> ==NIS Client== Install nis package. Set <code>/etc/defaultdomain</code> to contain <code>heterogeneous.ucd.ie</code> Make sure the NIS Server has an entry in <code>/etc/hosts</code>, this is because DNS may not be active when the NIS client is starting and we want to ensure that it connects to the server successful. 
192.168.21.254 heterogeneous.ucd.ie heterogeneous Make sure the file <code>/etc/nsswitch.conf</code> contains: passwd: compat group: compat shadow: compat Append to the file: <code>/etc/passwd</code> the line <code>+::::::</code> Append to the file: <code>/etc/group</code> the line <code>+:::</code> Append to the file: <code>/etc/shadow</code> the line <code>+::::::::</code> Edit <code>/etc/yp.conf</code> (this wasn't need before, it is now with raid server) ypserver 192.168.20.254 Start the nis service: service nis start Check that nis is operating correctly by running the following command: ypcat passwd ==NFS== apt-get install nfs-common portmap Add the line to <code>/etc/fstab</code> 192.168.20.254:/home /home nfs soft,retrans=6 0 0 Set in <code>/etc/default/nfs-common</code> (this wasn't need before, it is now with raid server) NEED_IDMAPD=yes Then: service nfs-common restart mount /home ==Torque PBS== First install PBS on headnode [[New_heterogeneous.ucd.ie_install_log#Packages_for_nodes|explained here]]. Then: ./torque-package-mom-linux-i686.sh --install ./torque-package-clients-linux-i686.sh --install update-rc.d pbs_mom defaults service pbs_mom start ==NTP== Install NTP software: apt-get install ntp Edit configuration and make <code>server heterogeneous.ucd.ie</code> the sole server entry. Comment out any other servers. Restart the NTP service. =Complications= ==Hostnames== Debian does not pull the hostname from the DHCP Server. Without intervention cloned nodes will keep the hostname stored on the image of the root node. 
A bug describing the setting of a hostname via DHCP is described [https://bugs.launchpad.net/ubuntu/+source/dhcp3/+bug/90388 here] The solution we will use is to add the file <code>/etc/dhcp3/dhclient-exit-hooks.d/hostname</code> with the contents: <source lang="bash"> if [[ -n $new_host_name ]]; then echo "$new_host_name" > /etc/hostname /bin/hostname $new_host_name fi </source> The effect of this is to set the hostname of the machine after an interface is configured using dhclient (DHCP Client). Note, the hostname of the machine will be set by the last interface that is configured via DHCP, in the current configuration that will be <code>eth0</code>. If an interface is reconfigured using dhclient, the hostname will be reset to the name belonging to that interface. Further, the current hostnames for the second interface on nodes <code>eth1</code> are '''invalid'''. They follow the format hcl??_eth1.ucd.ie, however the '_' character is not permitted in hostnames and attempting to set such a hostname fails. ==udev and Network Interfaces== The udev system attempts to keep network interface names consistent regardless of changing hardware. This may be useful for laptops with wirless cards that a plugged in and out, but it causes problems when trying to install our root node image across all machines in the cluster. A description of the problem can be read [http://www.ducea.com/2008/09/01/remove-debian-udev-persistent-net-rules/ here]. The solution is to remove the udev rules for persistent network interfaces, and disable the generator script for these rules. 
On the root cloning node do the following #remove the file <code>/etc/udev/rules.d/70-persistent-net.rules</code> #and to the top of the file: <code>/lib/udev/rules.d/75-persistent-net-generator.rules</code>, the following lines: <source lang="text"># skip generation of persistent network interfaces ACTION=="*", GOTO="persistent_net_generator_end"</source> ==Sysstat== [http://pagesperso-orange.fr/sebastien.godard/ Sysstat] is a useful suite of tools for measuring performance of different system components. Unfortunately it adds some un-useful cron entries for collecting a historic set of system performance data. Though these cron entries point to disabled scripts, we will disable them none the less. Edit the file <code>/etc/cron.d/sysstat</code> and comment out all lines. <source lang="text"> # The first element of the path is a directory where the debian-sa1 # script is located #PATH=/usr/lib/sysstat:/usr/sbin:/usr/sbin:/usr/bin:/sbin:/bin # Activity reports every 10 minutes everyday #5-55/10 * * * * root command -v debian-sa1 > /dev/null && debian-sa1 1 1 # Additional run at 23:59 to rotate the statistics file #59 23 * * * root command -v debian-sa1 > /dev/null && debian-sa1 60 2 </source> =Upgrade September 2013= Upgrade motivated by needing a kernel newer than 2.6.35 Nodes upgraded (11): hcl01 hcl03 hcl05 hcl06 hcl10 hcl11 hcl12 hcl13 hcl14 hcl15 hcl16 Nodes not upgraded because unbootable before upgrade (5) hcl02 hcl04 hcl07 hcl08 hcl09 vi /etc/apt/sources.list replace squeeze with wheezy In parallel on all with Cluster ssh (cssh) apt-get update apt-get dist-upgrade 56c8e0105e27bc167e8cf9926af39b4f9837a6df 819 818 2013-09-26T18:03:48Z Davepc 2 /* Upgrade September 2013 */ wikitext text/x-wiki HCL Nodes will be installed from a clone of a root node, <code>hcl07</code>. The general installation of the root is documented here. There are a number of complications as a result of the cloning process. Solutions to these complications are also explained. 
See also [[HCL_cluster/heterogeneous.ucd.ie_install_log]] =General Installation= Partition filesystem with swap at the end of the disk, size 1GB, equal to maximum of the installed memory on cluster nodes. Root file system occupies the remainder of the disk, EXT4 format. Install long list of packages. ==Networking== Configure network interface as follows: <source lang="text"># This file describes the network interfaces available on your system # and how to activate them. For more information, see interfaces(5). # The loopback network interface auto lo eth0 eth1 iface lo inet loopback # The primary network interface allow-hotplug eth0 iface eth0 inet dhcp allow-hotplug eth1 iface eth1 inet dhcp </source> ===Routing Tables=== Add an executable script named <code>00routes</code> to the directory <code>/etc/network/if-up.d</code>. This script will be called after the interfaces listed as <code>auto</code> are brought up on boot (or networking service restart). Note: the DHCP process for eth1 is backgrounded so that the startup of other services can continue. Our routing script adds routes on this interface even though it is not yet fully up. The script outputs some errors but the routing entries remain none-the-less. The script should read as follows: <source lang="bash"> #!/bin/sh # Static Routes # route ganglia broadcast t route add -host 239.2.11.72 dev eth0 # all traffic to heterogeneous gate goes through eth0 route add -host 192.168.20.254 dev eth0 # all subnet traffic goes through specific interface route add -net 192.168.20.0 netmask 255.255.255.0 dev eth0 route add -net 192.168.21.0 netmask 255.255.255.0 dev eth1 </source> The naming of the script is important, we want our routes in place before other scripts in the <code>/etc/network/if-up.d</code> directory are executed, the order in which they are executed is alphabetical. 
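The alphabetical ordering claim above is easy to check: scripts in <code>/etc/network/if-up.d</code> run in lexical sort order, so a name beginning with <code>00</code> runs first. A quick illustration with hypothetical script names (only <code>00routes</code> is from the text):

```shell
# Hypothetical contents of /etc/network/if-up.d; the sorted order is the
# execution order, so 00routes is guaranteed to run before the others.
printf '%s\n' mountnfs ntpdate upstart 00routes | LC_ALL=C sort
```

Using `LC_ALL=C` pins the byte-wise collation that run-parts effectively relies on, so digits sort before letters regardless of locale.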
===Hosts=== Change the hosts file so that it does not list the node's hostname, otherwise this would confuse nodes that are cloned from this image. <source lang="text"> 127.0.0.1 localhost # The following lines are desirable for IPv6 capable hosts ::1 localhost ip6-localhost ip6-loopback fe00::0 ip6-localnet ff00::0 ip6-mcastprefix ff02::1 ip6-allnodes ff02::2 ip6-allrouters </source> ==Ganglia== Install the ganglia-monitor package. Configure ganglia monitor by editing <code>/etc/ganglia/gmond.conf</code> so that it contains: <source lang="text">cluster { name = "HCL Cluster" owner = "University College Dublin" latlong = "unspecified" url = "http://hcl.ucd.ie/" } </source> And ... <source lang="text"> /* Feel free to specify as many udp_send_channels as you like. Gmond used to only support having a single channel */ udp_send_channel { mcast_join = 239.2.11.72 port = 8649 ttl = 1 } /* You can specify as many udp_recv_channels as you like as well. */ udp_recv_channel { mcast_join = 239.2.11.72 port = 8649 bind = 239.2.11.72 } </source> After all packages are complete execute: <source lang="text"> service ganglia-monitor restart </source> ==NIS Client== Install nis package. Set <code>/etc/defaultdomain</code> to contain <code>heterogeneous.ucd.ie</code> Make sure the NIS Server has an entry in <code>/etc/hosts</code>, this is because DNS may not be active when the NIS client is starting and we want to ensure that it connects to the server successful. 
192.168.21.254 heterogeneous.ucd.ie heterogeneous Make sure the file <code>/etc/nsswitch.conf</code> contains: passwd: compat group: compat shadow: compat Append to the file: <code>/etc/passwd</code> the line <code>+::::::</code> Append to the file: <code>/etc/group</code> the line <code>+:::</code> Append to the file: <code>/etc/shadow</code> the line <code>+::::::::</code> Edit <code>/etc/yp.conf</code> (this wasn't need before, it is now with raid server) ypserver 192.168.20.254 Start the nis service: service nis start Check that nis is operating correctly by running the following command: ypcat passwd ==NFS== apt-get install nfs-common portmap Add the line to <code>/etc/fstab</code> 192.168.20.254:/home /home nfs soft,retrans=6 0 0 Set in <code>/etc/default/nfs-common</code> (this wasn't need before, it is now with raid server) NEED_IDMAPD=yes Then: service nfs-common restart mount /home ==Torque PBS== First install PBS on headnode [[New_heterogeneous.ucd.ie_install_log#Packages_for_nodes|explained here]]. Then: ./torque-package-mom-linux-i686.sh --install ./torque-package-clients-linux-i686.sh --install update-rc.d pbs_mom defaults service pbs_mom start ==NTP== Install NTP software: apt-get install ntp Edit configuration and make <code>server heterogeneous.ucd.ie</code> the sole server entry. Comment out any other servers. Restart the NTP service. =Complications= ==Hostnames== Debian does not pull the hostname from the DHCP Server. Without intervention cloned nodes will keep the hostname stored on the image of the root node. 
A bug describing the setting of a hostname via DHCP is described [https://bugs.launchpad.net/ubuntu/+source/dhcp3/+bug/90388 here] The solution we will use is to add the file <code>/etc/dhcp3/dhclient-exit-hooks.d/hostname</code> with the contents: <source lang="bash"> if [[ -n $new_host_name ]]; then echo "$new_host_name" > /etc/hostname /bin/hostname $new_host_name fi </source> The effect of this is to set the hostname of the machine after an interface is configured using dhclient (DHCP Client). Note, the hostname of the machine will be set by the last interface that is configured via DHCP, in the current configuration that will be <code>eth0</code>. If an interface is reconfigured using dhclient, the hostname will be reset to the name belonging to that interface. Further, the current hostnames for the second interface on nodes <code>eth1</code> are '''invalid'''. They follow the format hcl??_eth1.ucd.ie, however the '_' character is not permitted in hostnames and attempting to set such a hostname fails. ==udev and Network Interfaces== The udev system attempts to keep network interface names consistent regardless of changing hardware. This may be useful for laptops with wirless cards that a plugged in and out, but it causes problems when trying to install our root node image across all machines in the cluster. A description of the problem can be read [http://www.ducea.com/2008/09/01/remove-debian-udev-persistent-net-rules/ here]. The solution is to remove the udev rules for persistent network interfaces, and disable the generator script for these rules. 
On the root cloning node do the following #remove the file <code>/etc/udev/rules.d/70-persistent-net.rules</code> #and to the top of the file: <code>/lib/udev/rules.d/75-persistent-net-generator.rules</code>, the following lines: <source lang="text"># skip generation of persistent network interfaces ACTION=="*", GOTO="persistent_net_generator_end"</source> ==Sysstat== [http://pagesperso-orange.fr/sebastien.godard/ Sysstat] is a useful suite of tools for measuring performance of different system components. Unfortunately it adds some un-useful cron entries for collecting a historic set of system performance data. Though these cron entries point to disabled scripts, we will disable them none the less. Edit the file <code>/etc/cron.d/sysstat</code> and comment out all lines. <source lang="text"> # The first element of the path is a directory where the debian-sa1 # script is located #PATH=/usr/lib/sysstat:/usr/sbin:/usr/sbin:/usr/bin:/sbin:/bin # Activity reports every 10 minutes everyday #5-55/10 * * * * root command -v debian-sa1 > /dev/null && debian-sa1 1 1 # Additional run at 23:59 to rotate the statistics file #59 23 * * * root command -v debian-sa1 > /dev/null && debian-sa1 60 2 </source> =Upgrade September 2013= Upgrade motivated by needing a kernel newer than 2.6.35 for PAPI Nodes upgraded (11): hcl01 hcl03 hcl05 hcl06 hcl10 hcl11 hcl12 hcl13 hcl14 hcl15 hcl16 Nodes not upgraded because unbootable before upgrade (5) hcl02 hcl04 hcl07 hcl08 hcl09 vi /etc/apt/sources.list replace squeeze with wheezy In parallel on all with Cluster ssh (cssh) apt-get update apt-get dist-upgrade ==Installation of PAPI== get papi-5.2.0.tar.gz tar -xzvf papi-5.2.0.tar.gz cd papi-5.2.0/src/ ./configure --prefix=/usr/local make make test make install ==Installation of Extrae get extrae-2.4.0.tar.bz2 tar -xjvf extrae-2.4.0.tar.bz2 cd extrae-2.4.0 ./configure 67e11d4faa050aacd11be6f3e402515c656f4c0d 820 819 2013-09-26T18:07:04Z Davepc 2 /* Upgrade September 2013 */ wikitext text/x-wiki 
HCL Nodes will be installed from a clone of a root node, <code>hcl07</code>. The general installation of the root is documented here. There are a number of complications as a result of the cloning process. Solutions to these complications are also explained. See also [[HCL_cluster/heterogeneous.ucd.ie_install_log]] =General Installation= Partition filesystem with swap at the end of the disk, size 1GB, equal to maximum of the installed memory on cluster nodes. Root file system occupies the remainder of the disk, EXT4 format. Install long list of packages. ==Networking== Configure network interface as follows: <source lang="text"># This file describes the network interfaces available on your system # and how to activate them. For more information, see interfaces(5). # The loopback network interface auto lo eth0 eth1 iface lo inet loopback # The primary network interface allow-hotplug eth0 iface eth0 inet dhcp allow-hotplug eth1 iface eth1 inet dhcp </source> ===Routing Tables=== Add an executable script named <code>00routes</code> to the directory <code>/etc/network/if-up.d</code>. This script will be called after the interfaces listed as <code>auto</code> are brought up on boot (or networking service restart). Note: the DHCP process for eth1 is backgrounded so that the startup of other services can continue. Our routing script adds routes on this interface even though it is not yet fully up. The script outputs some errors but the routing entries remain none-the-less. 
The script should read as follows: <source lang="bash"> #!/bin/sh # Static Routes # route ganglia broadcast t route add -host 239.2.11.72 dev eth0 # all traffic to heterogeneous gate goes through eth0 route add -host 192.168.20.254 dev eth0 # all subnet traffic goes through specific interface route add -net 192.168.20.0 netmask 255.255.255.0 dev eth0 route add -net 192.168.21.0 netmask 255.255.255.0 dev eth1 </source> The naming of the script is important, we want our routes in place before other scripts in the <code>/etc/network/if-up.d</code> directory are executed, the order in which they are executed is alphabetical. ===Hosts=== Change the hosts file so that it does not list the node's hostname, otherwise this would confuse nodes that are cloned from this image. <source lang="text"> 127.0.0.1 localhost # The following lines are desirable for IPv6 capable hosts ::1 localhost ip6-localhost ip6-loopback fe00::0 ip6-localnet ff00::0 ip6-mcastprefix ff02::1 ip6-allnodes ff02::2 ip6-allrouters </source> ==Ganglia== Install the ganglia-monitor package. Configure ganglia monitor by editing <code>/etc/ganglia/gmond.conf</code> so that it contains: <source lang="text">cluster { name = "HCL Cluster" owner = "University College Dublin" latlong = "unspecified" url = "http://hcl.ucd.ie/" } </source> And ... <source lang="text"> /* Feel free to specify as many udp_send_channels as you like. Gmond used to only support having a single channel */ udp_send_channel { mcast_join = 239.2.11.72 port = 8649 ttl = 1 } /* You can specify as many udp_recv_channels as you like as well. */ udp_recv_channel { mcast_join = 239.2.11.72 port = 8649 bind = 239.2.11.72 } </source> After all packages are complete execute: <source lang="text"> service ganglia-monitor restart </source> ==NIS Client== Install nis package. 
Set <code>/etc/defaultdomain</code> to contain <code>heterogeneous.ucd.ie</code> Make sure the NIS Server has an entry in <code>/etc/hosts</code>, this is because DNS may not be active when the NIS client is starting and we want to ensure that it connects to the server successful. 192.168.21.254 heterogeneous.ucd.ie heterogeneous Make sure the file <code>/etc/nsswitch.conf</code> contains: passwd: compat group: compat shadow: compat Append to the file: <code>/etc/passwd</code> the line <code>+::::::</code> Append to the file: <code>/etc/group</code> the line <code>+:::</code> Append to the file: <code>/etc/shadow</code> the line <code>+::::::::</code> Edit <code>/etc/yp.conf</code> (this wasn't need before, it is now with raid server) ypserver 192.168.20.254 Start the nis service: service nis start Check that nis is operating correctly by running the following command: ypcat passwd ==NFS== apt-get install nfs-common portmap Add the line to <code>/etc/fstab</code> 192.168.20.254:/home /home nfs soft,retrans=6 0 0 Set in <code>/etc/default/nfs-common</code> (this wasn't need before, it is now with raid server) NEED_IDMAPD=yes Then: service nfs-common restart mount /home ==Torque PBS== First install PBS on headnode [[New_heterogeneous.ucd.ie_install_log#Packages_for_nodes|explained here]]. Then: ./torque-package-mom-linux-i686.sh --install ./torque-package-clients-linux-i686.sh --install update-rc.d pbs_mom defaults service pbs_mom start ==NTP== Install NTP software: apt-get install ntp Edit configuration and make <code>server heterogeneous.ucd.ie</code> the sole server entry. Comment out any other servers. Restart the NTP service. =Complications= ==Hostnames== Debian does not pull the hostname from the DHCP Server. Without intervention cloned nodes will keep the hostname stored on the image of the root node. 
A bug report describing setting the hostname via DHCP can be found [https://bugs.launchpad.net/ubuntu/+source/dhcp3/+bug/90388 here]. The solution we will use is to add the file <code>/etc/dhcp3/dhclient-exit-hooks.d/hostname</code> with the contents: <source lang="bash"> if [[ -n $new_host_name ]]; then echo "$new_host_name" > /etc/hostname /bin/hostname $new_host_name fi </source> The effect of this is to set the hostname of the machine after an interface is configured using dhclient (the DHCP client). Note: the hostname of the machine will be set by the last interface configured via DHCP; in the current configuration that will be <code>eth0</code>. If an interface is reconfigured using dhclient, the hostname will be reset to the name belonging to that interface. Further, the current hostnames for the second interface (<code>eth1</code>) on the nodes are '''invalid'''. They follow the format hcl??_eth1.ucd.ie; however, the '_' character is not permitted in hostnames, and attempting to set such a hostname fails. ==udev and Network Interfaces== The udev system attempts to keep network interface names consistent regardless of changing hardware. This may be useful for laptops with wireless cards that are plugged in and out, but it causes problems when trying to install our root node image across all machines in the cluster. A description of the problem can be read [http://www.ducea.com/2008/09/01/remove-debian-udev-persistent-net-rules/ here]. The solution is to remove the udev rules for persistent network interfaces, and disable the generator script for these rules.
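The exit hook's logic can be exercised without root or a running dhclient by redirecting it at a scratch file. In this sketch <code>$new_host_name</code> is set by hand with an example value; in real use dhclient sets it from the DHCP lease before running the exit hooks:

```shell
# Sketch of the dhclient exit-hook logic, writing to a scratch file
# instead of /etc/hostname so it can be tested unprivileged.
# hcl05 is just an example value for the lease variable.
hostname_file=$(mktemp)
new_host_name="hcl05"
if [ -n "$new_host_name" ]; then
    echo "$new_host_name" > "$hostname_file"
    # the real hook would also apply it live: /bin/hostname "$new_host_name"
fi
```

The <code>-n</code> guard matters: on an interface whose lease carries no hostname, the variable is empty and the hook must leave the current hostname alone.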
On the root cloning node, do the following: # remove the file <code>/etc/udev/rules.d/70-persistent-net.rules</code> # add to the top of the file <code>/lib/udev/rules.d/75-persistent-net-generator.rules</code> the following lines: <source lang="text"># skip generation of persistent network interfaces ACTION=="*", GOTO="persistent_net_generator_end"</source> ==Sysstat== [http://pagesperso-orange.fr/sebastien.godard/ Sysstat] is a useful suite of tools for measuring the performance of different system components. Unfortunately, it adds some unhelpful cron entries for collecting a history of system performance data. Though these cron entries point to disabled scripts, we will disable them nonetheless. Edit the file <code>/etc/cron.d/sysstat</code> and comment out all lines: <source lang="text"> # The first element of the path is a directory where the debian-sa1 # script is located #PATH=/usr/lib/sysstat:/usr/sbin:/usr/sbin:/usr/bin:/sbin:/bin # Activity reports every 10 minutes everyday #5-55/10 * * * * root command -v debian-sa1 > /dev/null && debian-sa1 1 1 # Additional run at 23:59 to rotate the statistics file #59 23 * * * root command -v debian-sa1 > /dev/null && debian-sa1 60 2 </source> =Upgrade September 2013= The upgrade was motivated by needing a kernel newer than 2.6.35 for PAPI. Linux hcl03.heterogeneous.ucd.ie 2.6.32-5-686 #1 SMP Wed Jan 12 04:01:41 UTC 2011 i686 GNU/Linux Nodes upgraded (11): hcl01 hcl03 hcl05 hcl06 hcl10 hcl11 hcl12 hcl13 hcl14 hcl15 hcl16 Nodes not upgraded because they were unbootable before the upgrade (5): hcl02 hcl04 hcl07 hcl08 hcl09 vi /etc/apt/sources.list and replace squeeze with wheezy. In parallel on all nodes with Cluster SSH (cssh): apt-get update apt-get dist-upgrade reboot ==Installation of PAPI== get papi-5.2.0.tar.gz tar -xzvf papi-5.2.0.tar.gz cd papi-5.2.0/src/ ./configure --prefix=/usr/local make make test make install ==Installation of Extrae== get extrae-2.4.0.tar.bz2 tar -xjvf extrae-2.4.0.tar.bz2 cd extrae-2.4.0 ./configure
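Commenting out every active line of <code>/etc/cron.d/sysstat</code> can be done with a single sed expression instead of hand-editing. A sketch run against a scratch copy (GNU sed is assumed for <code>-i</code>; the sample lines mirror the file shown above):

```shell
# Prefix '#' to every line that is not already a comment or blank.
cron_copy=$(mktemp)
printf '%s\n' \
  'PATH=/usr/lib/sysstat:/usr/sbin:/usr/bin:/sbin:/bin' \
  '5-55/10 * * * * root command -v debian-sa1 > /dev/null && debian-sa1 1 1' \
  '# a line that is already commented' > "$cron_copy"
# Match a first character that is neither '#' nor whitespace and
# re-emit it with a '#' in front; comments and blank lines are untouched.
sed -i 's/^\([^#[:space:]]\)/#\1/' "$cron_copy"
```

Run against <code>/etc/cron.d/sysstat</code> itself, this is also idempotent: a second pass leaves the already-commented lines alone.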
f8f11adb7b60348ae7eea4f5ac24d16525e22b58 821 820 2013-09-26T18:08:23Z Davepc 2 /* Upgrade September 2013 */ wikitext text/x-wiki HCL nodes will be installed from a clone of a root node, <code>hcl07</code>. The general installation of the root node is documented here. There are a number of complications as a result of the cloning process; solutions to these complications are also explained. See also [[HCL_cluster/heterogeneous.ucd.ie_install_log]] =General Installation= Partition the disk with a 1 GB swap partition at the end, equal to the maximum installed memory on the cluster nodes. The root file system occupies the remainder of the disk, in EXT4 format. Install the long list of packages. ==Networking== Configure the network interfaces as follows: <source lang="text"># This file describes the network interfaces available on your system # and how to activate them. For more information, see interfaces(5). # The loopback network interface auto lo eth0 eth1 iface lo inet loopback # The primary network interface allow-hotplug eth0 iface eth0 inet dhcp allow-hotplug eth1 iface eth1 inet dhcp </source> ===Routing Tables=== Add an executable script named <code>00routes</code> to the directory <code>/etc/network/if-up.d</code>. This script will be called after the interfaces listed as <code>auto</code> are brought up on boot (or on a networking service restart). Note: the DHCP process for eth1 is backgrounded so that the startup of other services can continue. Our routing script adds routes on this interface even though it is not yet fully up. The script prints some errors, but the routing entries remain in place nonetheless.
On the root cloning node, do the following: # remove the file <code>/etc/udev/rules.d/70-persistent-net.rules</code> # add to the top of the file <code>/lib/udev/rules.d/75-persistent-net-generator.rules</code> the following lines: <source lang="text"># skip generation of persistent network interfaces ACTION=="*", GOTO="persistent_net_generator_end"</source> ==Sysstat== [http://pagesperso-orange.fr/sebastien.godard/ Sysstat] is a useful suite of tools for measuring the performance of different system components. Unfortunately, it adds some unhelpful cron entries for collecting a history of system performance data. Though these cron entries point to disabled scripts, we will disable them nonetheless. Edit the file <code>/etc/cron.d/sysstat</code> and comment out all lines: <source lang="text"> # The first element of the path is a directory where the debian-sa1 # script is located #PATH=/usr/lib/sysstat:/usr/sbin:/usr/sbin:/usr/bin:/sbin:/bin # Activity reports every 10 minutes everyday #5-55/10 * * * * root command -v debian-sa1 > /dev/null && debian-sa1 1 1 # Additional run at 23:59 to rotate the statistics file #59 23 * * * root command -v debian-sa1 > /dev/null && debian-sa1 60 2 </source> =Upgrade September 2013= The upgrade was motivated by needing a kernel newer than 2.6.35 for PAPI. Before: uname -a Linux hcl03.heterogeneous.ucd.ie 2.6.32-5-686 #1 SMP Wed Jan 12 04:01:41 UTC 2011 i686 GNU/Linux After: Linux hcl03.heterogeneous.ucd.ie 3.2.0-4-686-pae #1 SMP Debian 3.2.46-1+deb7u1 i686 GNU/Linux Nodes upgraded (11): hcl01 hcl03 hcl05 hcl06 hcl10 hcl11 hcl12 hcl13 hcl14 hcl15 hcl16 Nodes not upgraded because they were unbootable before the upgrade (5): hcl02 hcl04 hcl07 hcl08 hcl09 vi /etc/apt/sources.list and replace squeeze with wheezy. In parallel on all nodes with Cluster SSH (cssh): apt-get update apt-get dist-upgrade reboot Questions asked during the upgrade: Q: Configuration file `/etc/pam.d/sshd' modified (by you or by a script) since installation.
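The <code>vi /etc/apt/sources.list</code> step can be scripted, which helps when driving every node at once through cssh. A sketch on a scratch copy (the mirror URL and the line contents are only examples; on a node this would target <code>/etc/apt/sources.list</code> itself):

```shell
# Switch the Debian release codename in a copy of sources.list.
sources_copy=$(mktemp)
echo 'deb http://ftp.debian.org/debian squeeze main' > "$sources_copy"
# Replace every occurrence of the old codename with the new one
sed -i 's/squeeze/wheezy/g' "$sources_copy"
cat "$sources_copy"
```

After the substitution, <code>apt-get update && apt-get dist-upgrade</code> proceeds against the wheezy repositories as described above.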
==Installation of Extrae== get extrae-2.4.0.tar.bz2 tar -xjvf extrae-2.4.0.tar.bz2 cd extrae-2.4.0 ./configure f92869da55b56b7b4bf6ca0f1525b2454ed0b687 822 821 2013-09-26T18:15:31Z Davepc 2 /* Upgrade September 2013 */ wikitext text/x-wiki HCL nodes will be installed from a clone of a root node, <code>hcl07</code>. The general installation of the root node is documented here. There are a number of complications as a result of the cloning process; solutions to these complications are also explained. See also [[HCL_cluster/heterogeneous.ucd.ie_install_log]] =General Installation= Partition the disk with a 1 GB swap partition at the end, equal to the maximum installed memory on the cluster nodes. The root file system occupies the remainder of the disk, in EXT4 format. Install the long list of packages. ==Networking== Configure the network interfaces as follows: <source lang="text"># This file describes the network interfaces available on your system # and how to activate them. For more information, see interfaces(5). # The loopback network interface auto lo eth0 eth1 iface lo inet loopback # The primary network interface allow-hotplug eth0 iface eth0 inet dhcp allow-hotplug eth1 iface eth1 inet dhcp </source> ===Routing Tables=== Add an executable script named <code>00routes</code> to the directory <code>/etc/network/if-up.d</code>. This script will be called after the interfaces listed as <code>auto</code> are brought up on boot (or on a networking service restart). Note: the DHCP process for eth1 is backgrounded so that the startup of other services can continue. Our routing script adds routes on this interface even though it is not yet fully up. The script prints some errors, but the routing entries remain in place nonetheless.
A: keep your currently-installed version. Q: A new version of configuration file /etc/default/nfs-common is available. A: Keep the installed version. Q: A new version of configuration file /etc/default/grub is available. A: Install the package manager's version. ==Installation of PAPI== get papi-5.2.0.tar.gz tar -xzvf papi-5.2.0.tar.gz cd papi-5.2.0/src/ ./configure --prefix=/usr/local make make test make install ==Installation of Extrae== get extrae-2.4.0.tar.bz2 tar -xjvf extrae-2.4.0.tar.bz2 cd extrae-2.4.0 ./configure --with-mpi=/usr/lib/openmpi --with-unwind=/usr --without-dyninst e956bc9e382a92b17d6e2a39e4e98e89463d76d8 823 822 2013-09-26T19:38:25Z Davepc 2 /* Upgrade September 2013 */ wikitext text/x-wiki HCL nodes will be installed from a clone of a root node, <code>hcl07</code>. The general installation of the root node is documented here. There are a number of complications as a result of the cloning process; solutions to these complications are also explained. See also [[HCL_cluster/heterogeneous.ucd.ie_install_log]] =General Installation= Partition the disk with a 1 GB swap partition at the end, equal to the maximum installed memory on the cluster nodes. The root file system occupies the remainder of the disk, in EXT4 format. Install the long list of packages. ==Networking== Configure the network interfaces as follows: <source lang="text"># This file describes the network interfaces available on your system # and how to activate them. For more information, see interfaces(5). # The loopback network interface auto lo eth0 eth1 iface lo inet loopback # The primary network interface allow-hotplug eth0 iface eth0 inet dhcp allow-hotplug eth1 iface eth1 inet dhcp </source>
A: keep your currently-installed version. Q: A new version of configuration file /etc/default/nfs-common is available. A: Keep the installed version. Q: A new version of configuration file /etc/default/grub is available. A: Install the package manager's version. ==Installation of PAPI== get papi-5.2.0.tar.gz tar -xzvf papi-5.2.0.tar.gz cd papi-5.2.0/src/ ./configure --prefix=/usr/local make make test make install make test The make test step failed: the hardware is too old and would need a patched kernel. Installing PAPI on HCL is abandoned for now. ==Installation of Extrae== apt-get install libunwind7-dev apt-get install binutils-dev get extrae-2.4.0.tar.bz2 tar -xjvf extrae-2.4.0.tar.bz2 cd extrae-2.4.0 ./configure --with-mpi=/usr/lib/openmpi --with-unwind=/usr --without-dyninst --without-papi Note: this turns off a lot of good Extrae features because of missing packages. f2222490756fcf21ad19f00ee3b5dcbd2177f12a 824 823 2013-09-26T19:42:26Z Davepc 2 /* Upgrade September 2013 */ wikitext text/x-wiki HCL nodes will be installed from a clone of a root node, <code>hcl07</code>. The general installation of the root node is documented here. There are a number of complications as a result of the cloning process; solutions to these complications are also explained. See also [[HCL_cluster/heterogeneous.ucd.ie_install_log]] =General Installation= Partition the disk with a 1 GB swap partition at the end, equal to the maximum installed memory on the cluster nodes. The root file system occupies the remainder of the disk, in EXT4 format. Install the long list of packages. ==Networking== Configure the network interfaces as follows: <source lang="text"># This file describes the network interfaces available on your system # and how to activate them. For more information, see interfaces(5).
# The loopback network interface auto lo eth0 eth1 iface lo inet loopback # The primary network interface allow-hotplug eth0 iface eth0 inet dhcp allow-hotplug eth1 iface eth1 inet dhcp </source> ===Routing Tables=== Add an executable script named <code>00routes</code> to the directory <code>/etc/network/if-up.d</code>. This script will be called after the interfaces listed as <code>auto</code> are brought up on boot (or networking service restart). Note: the DHCP process for eth1 is backgrounded so that the startup of other services can continue. Our routing script adds routes on this interface even though it is not yet fully up. The script outputs some errors but the routing entries remain none-the-less. The script should read as follows: <source lang="bash"> #!/bin/sh # Static Routes # route ganglia broadcast t route add -host 239.2.11.72 dev eth0 # all traffic to heterogeneous gate goes through eth0 route add -host 192.168.20.254 dev eth0 # all subnet traffic goes through specific interface route add -net 192.168.20.0 netmask 255.255.255.0 dev eth0 route add -net 192.168.21.0 netmask 255.255.255.0 dev eth1 </source> The naming of the script is important, we want our routes in place before other scripts in the <code>/etc/network/if-up.d</code> directory are executed, the order in which they are executed is alphabetical. ===Hosts=== Change the hosts file so that it does not list the node's hostname, otherwise this would confuse nodes that are cloned from this image. <source lang="text"> 127.0.0.1 localhost # The following lines are desirable for IPv6 capable hosts ::1 localhost ip6-localhost ip6-loopback fe00::0 ip6-localnet ff00::0 ip6-mcastprefix ff02::1 ip6-allnodes ff02::2 ip6-allrouters </source> ==Ganglia== Install the ganglia-monitor package. 
Configure ganglia monitor by editing <code>/etc/ganglia/gmond.conf</code> so that it contains: <source lang="text">cluster { name = "HCL Cluster" owner = "University College Dublin" latlong = "unspecified" url = "http://hcl.ucd.ie/" } </source> And ... <source lang="text"> /* Feel free to specify as many udp_send_channels as you like. Gmond used to only support having a single channel */ udp_send_channel { mcast_join = 239.2.11.72 port = 8649 ttl = 1 } /* You can specify as many udp_recv_channels as you like as well. */ udp_recv_channel { mcast_join = 239.2.11.72 port = 8649 bind = 239.2.11.72 } </source> After all packages are complete execute: <source lang="text"> service ganglia-monitor restart </source> ==NIS Client== Install nis package. Set <code>/etc/defaultdomain</code> to contain <code>heterogeneous.ucd.ie</code> Make sure the NIS Server has an entry in <code>/etc/hosts</code>, this is because DNS may not be active when the NIS client is starting and we want to ensure that it connects to the server successful. 
192.168.21.254 heterogeneous.ucd.ie heterogeneous Make sure the file <code>/etc/nsswitch.conf</code> contains: passwd: compat group: compat shadow: compat Append to the file: <code>/etc/passwd</code> the line <code>+::::::</code> Append to the file: <code>/etc/group</code> the line <code>+:::</code> Append to the file: <code>/etc/shadow</code> the line <code>+::::::::</code> Edit <code>/etc/yp.conf</code> (this wasn't need before, it is now with raid server) ypserver 192.168.20.254 Start the nis service: service nis start Check that nis is operating correctly by running the following command: ypcat passwd ==NFS== apt-get install nfs-common portmap Add the line to <code>/etc/fstab</code> 192.168.20.254:/home /home nfs soft,retrans=6 0 0 Set in <code>/etc/default/nfs-common</code> (this wasn't need before, it is now with raid server) NEED_IDMAPD=yes Then: service nfs-common restart mount /home ==Torque PBS== First install PBS on headnode [[New_heterogeneous.ucd.ie_install_log#Packages_for_nodes|explained here]]. Then: ./torque-package-mom-linux-i686.sh --install ./torque-package-clients-linux-i686.sh --install update-rc.d pbs_mom defaults service pbs_mom start ==NTP== Install NTP software: apt-get install ntp Edit configuration and make <code>server heterogeneous.ucd.ie</code> the sole server entry. Comment out any other servers. Restart the NTP service. =Complications= ==Hostnames== Debian does not pull the hostname from the DHCP Server. Without intervention cloned nodes will keep the hostname stored on the image of the root node. 
A bug describing the setting of a hostname via DHCP is described [https://bugs.launchpad.net/ubuntu/+source/dhcp3/+bug/90388 here] The solution we will use is to add the file <code>/etc/dhcp3/dhclient-exit-hooks.d/hostname</code> with the contents: <source lang="bash"> if [[ -n $new_host_name ]]; then echo "$new_host_name" > /etc/hostname /bin/hostname $new_host_name fi </source> The effect of this is to set the hostname of the machine after an interface is configured using dhclient (DHCP Client). Note, the hostname of the machine will be set by the last interface that is configured via DHCP, in the current configuration that will be <code>eth0</code>. If an interface is reconfigured using dhclient, the hostname will be reset to the name belonging to that interface. Further, the current hostnames for the second interface on nodes <code>eth1</code> are '''invalid'''. They follow the format hcl??_eth1.ucd.ie, however the '_' character is not permitted in hostnames and attempting to set such a hostname fails. ==udev and Network Interfaces== The udev system attempts to keep network interface names consistent regardless of changing hardware. This may be useful for laptops with wirless cards that a plugged in and out, but it causes problems when trying to install our root node image across all machines in the cluster. A description of the problem can be read [http://www.ducea.com/2008/09/01/remove-debian-udev-persistent-net-rules/ here]. The solution is to remove the udev rules for persistent network interfaces, and disable the generator script for these rules. 
On the root cloning node do the following #remove the file <code>/etc/udev/rules.d/70-persistent-net.rules</code> #and to the top of the file: <code>/lib/udev/rules.d/75-persistent-net-generator.rules</code>, the following lines: <source lang="text"># skip generation of persistent network interfaces ACTION=="*", GOTO="persistent_net_generator_end"</source> ==Sysstat== [http://pagesperso-orange.fr/sebastien.godard/ Sysstat] is a useful suite of tools for measuring performance of different system components. Unfortunately it adds some un-useful cron entries for collecting a historic set of system performance data. Though these cron entries point to disabled scripts, we will disable them none the less. Edit the file <code>/etc/cron.d/sysstat</code> and comment out all lines. <source lang="text"> # The first element of the path is a directory where the debian-sa1 # script is located #PATH=/usr/lib/sysstat:/usr/sbin:/usr/sbin:/usr/bin:/sbin:/bin # Activity reports every 10 minutes everyday #5-55/10 * * * * root command -v debian-sa1 > /dev/null && debian-sa1 1 1 # Additional run at 23:59 to rotate the statistics file #59 23 * * * root command -v debian-sa1 > /dev/null && debian-sa1 60 2 </source> =Upgrade September 2013= Upgrade motivated by needing a kernel newer than 2.6.35 for PAPI Before: uname -a Linux hcl03.heterogeneous.ucd.ie 2.6.32-5-686 #1 SMP Wed Jan 12 04:01:41 UTC 2011 i686 GNU/Linux After: Linux hcl03.heterogeneous.ucd.ie 3.2.0-4-686-pae #1 SMP Debian 3.2.46-1+deb7u1 i686 GNU/Linux Nodes upgraded (11): hcl01 hcl03 hcl05 hcl06 hcl10 hcl11 hcl12 hcl13 hcl14 hcl15 hcl16 Nodes not upgraded because unbootable before upgrade (5) hcl02 hcl04 hcl07 hcl08 hcl09 vi /etc/apt/sources.list replace squeeze with wheezy In parallel on all with Cluster ssh (cssh) apt-get update apt-get dist-upgrade reboot Questions: Q:Configuration file `/etc/pam.d/sshd' modified (by you or by a script) since installation. 
A:keep your currently-installed version Q:A new version of configuration file /etc/default/nfs-common is available A:Keep installed version Q:A new version of configuration file /etc/default/grub is available A:Install package managers version. ==Installation of PAPI - failed== get papi-5.2.0.tar.gz tar -xzvf papi-5.2.0.tar.gz cd papi-5.2.0/src/ ./configure --prefix=/usr/local make make test Make test failed. Hardware too old, would need to patch kernel. Abandoned installing Papi on HCL for now. ==Installation of Extrae apt-get install apt-get install libunwind7-dev apt-get install binutils-dev get extrae-2.4.0.tar.bz2 tar -xjvf extrae-2.4.0.tar.bz2 cd extrae-2.4.0 ./configure --with-mpi=/usr/lib/openmpi --with-unwind=/usr --without-dyninst --without-papi --prefix=/usr/local Note - this is a lot of good features of extrae turned off because if missing packages. make make install 29a167f05f9784e4676cb131d21f39860f733515 825 824 2013-09-26T19:43:58Z Davepc 2 /* Upgrade September 2013 */ wikitext text/x-wiki HCL Nodes will be installed from a clone of a root node, <code>hcl07</code>. The general installation of the root is documented here. There are a number of complications as a result of the cloning process. Solutions to these complications are also explained. See also [[HCL_cluster/heterogeneous.ucd.ie_install_log]] =General Installation= Partition filesystem with swap at the end of the disk, size 1GB, equal to maximum of the installed memory on cluster nodes. Root file system occupies the remainder of the disk, EXT4 format. Install long list of packages. ==Networking== Configure network interface as follows: <source lang="text"># This file describes the network interfaces available on your system # and how to activate them. For more information, see interfaces(5). 
# The loopback network interface auto lo eth0 eth1 iface lo inet loopback # The primary network interface allow-hotplug eth0 iface eth0 inet dhcp allow-hotplug eth1 iface eth1 inet dhcp </source> ===Routing Tables=== Add an executable script named <code>00routes</code> to the directory <code>/etc/network/if-up.d</code>. This script will be called after the interfaces listed as <code>auto</code> are brought up on boot (or networking service restart). Note: the DHCP process for eth1 is backgrounded so that the startup of other services can continue. Our routing script adds routes on this interface even though it is not yet fully up. The script outputs some errors but the routing entries remain none-the-less. The script should read as follows: <source lang="bash"> #!/bin/sh # Static Routes # route ganglia broadcast t route add -host 239.2.11.72 dev eth0 # all traffic to heterogeneous gate goes through eth0 route add -host 192.168.20.254 dev eth0 # all subnet traffic goes through specific interface route add -net 192.168.20.0 netmask 255.255.255.0 dev eth0 route add -net 192.168.21.0 netmask 255.255.255.0 dev eth1 </source> The naming of the script is important, we want our routes in place before other scripts in the <code>/etc/network/if-up.d</code> directory are executed, the order in which they are executed is alphabetical. ===Hosts=== Change the hosts file so that it does not list the node's hostname, otherwise this would confuse nodes that are cloned from this image. <source lang="text"> 127.0.0.1 localhost # The following lines are desirable for IPv6 capable hosts ::1 localhost ip6-localhost ip6-loopback fe00::0 ip6-localnet ff00::0 ip6-mcastprefix ff02::1 ip6-allnodes ff02::2 ip6-allrouters </source> ==Ganglia== Install the ganglia-monitor package. 
Configure ganglia monitor by editing <code>/etc/ganglia/gmond.conf</code> so that it contains: <source lang="text">cluster { name = "HCL Cluster" owner = "University College Dublin" latlong = "unspecified" url = "http://hcl.ucd.ie/" } </source> And ... <source lang="text"> /* Feel free to specify as many udp_send_channels as you like. Gmond used to only support having a single channel */ udp_send_channel { mcast_join = 239.2.11.72 port = 8649 ttl = 1 } /* You can specify as many udp_recv_channels as you like as well. */ udp_recv_channel { mcast_join = 239.2.11.72 port = 8649 bind = 239.2.11.72 } </source> After all packages are complete execute: <source lang="text"> service ganglia-monitor restart </source> ==NIS Client== Install nis package. Set <code>/etc/defaultdomain</code> to contain <code>heterogeneous.ucd.ie</code> Make sure the NIS Server has an entry in <code>/etc/hosts</code>, this is because DNS may not be active when the NIS client is starting and we want to ensure that it connects to the server successful. 
192.168.21.254 heterogeneous.ucd.ie heterogeneous Make sure the file <code>/etc/nsswitch.conf</code> contains: passwd: compat group: compat shadow: compat Append to the file: <code>/etc/passwd</code> the line <code>+::::::</code> Append to the file: <code>/etc/group</code> the line <code>+:::</code> Append to the file: <code>/etc/shadow</code> the line <code>+::::::::</code> Edit <code>/etc/yp.conf</code> (this wasn't need before, it is now with raid server) ypserver 192.168.20.254 Start the nis service: service nis start Check that nis is operating correctly by running the following command: ypcat passwd ==NFS== apt-get install nfs-common portmap Add the line to <code>/etc/fstab</code> 192.168.20.254:/home /home nfs soft,retrans=6 0 0 Set in <code>/etc/default/nfs-common</code> (this wasn't need before, it is now with raid server) NEED_IDMAPD=yes Then: service nfs-common restart mount /home ==Torque PBS== First install PBS on headnode [[New_heterogeneous.ucd.ie_install_log#Packages_for_nodes|explained here]]. Then: ./torque-package-mom-linux-i686.sh --install ./torque-package-clients-linux-i686.sh --install update-rc.d pbs_mom defaults service pbs_mom start ==NTP== Install NTP software: apt-get install ntp Edit configuration and make <code>server heterogeneous.ucd.ie</code> the sole server entry. Comment out any other servers. Restart the NTP service. =Complications= ==Hostnames== Debian does not pull the hostname from the DHCP Server. Without intervention cloned nodes will keep the hostname stored on the image of the root node. 
A bug describing the setting of a hostname via DHCP is described [https://bugs.launchpad.net/ubuntu/+source/dhcp3/+bug/90388 here] The solution we will use is to add the file <code>/etc/dhcp3/dhclient-exit-hooks.d/hostname</code> with the contents: <source lang="bash"> if [[ -n $new_host_name ]]; then echo "$new_host_name" > /etc/hostname /bin/hostname $new_host_name fi </source> The effect of this is to set the hostname of the machine after an interface is configured using dhclient (DHCP Client). Note, the hostname of the machine will be set by the last interface that is configured via DHCP, in the current configuration that will be <code>eth0</code>. If an interface is reconfigured using dhclient, the hostname will be reset to the name belonging to that interface. Further, the current hostnames for the second interface on nodes <code>eth1</code> are '''invalid'''. They follow the format hcl??_eth1.ucd.ie, however the '_' character is not permitted in hostnames and attempting to set such a hostname fails. ==udev and Network Interfaces== The udev system attempts to keep network interface names consistent regardless of changing hardware. This may be useful for laptops with wirless cards that a plugged in and out, but it causes problems when trying to install our root node image across all machines in the cluster. A description of the problem can be read [http://www.ducea.com/2008/09/01/remove-debian-udev-persistent-net-rules/ here]. The solution is to remove the udev rules for persistent network interfaces, and disable the generator script for these rules. 
On the root cloning node do the following #remove the file <code>/etc/udev/rules.d/70-persistent-net.rules</code> #and to the top of the file: <code>/lib/udev/rules.d/75-persistent-net-generator.rules</code>, the following lines: <source lang="text"># skip generation of persistent network interfaces ACTION=="*", GOTO="persistent_net_generator_end"</source> ==Sysstat== [http://pagesperso-orange.fr/sebastien.godard/ Sysstat] is a useful suite of tools for measuring performance of different system components. Unfortunately it adds some un-useful cron entries for collecting a historic set of system performance data. Though these cron entries point to disabled scripts, we will disable them none the less. Edit the file <code>/etc/cron.d/sysstat</code> and comment out all lines. <source lang="text"> # The first element of the path is a directory where the debian-sa1 # script is located #PATH=/usr/lib/sysstat:/usr/sbin:/usr/sbin:/usr/bin:/sbin:/bin # Activity reports every 10 minutes everyday #5-55/10 * * * * root command -v debian-sa1 > /dev/null && debian-sa1 1 1 # Additional run at 23:59 to rotate the statistics file #59 23 * * * root command -v debian-sa1 > /dev/null && debian-sa1 60 2 </source> =Upgrade September 2013= Upgrade motivated by needing a kernel newer than 2.6.35 for PAPI Before: uname -a Linux hcl03.heterogeneous.ucd.ie 2.6.32-5-686 #1 SMP Wed Jan 12 04:01:41 UTC 2011 i686 GNU/Linux After: Linux hcl03.heterogeneous.ucd.ie 3.2.0-4-686-pae #1 SMP Debian 3.2.46-1+deb7u1 i686 GNU/Linux Nodes upgraded (11): hcl01 hcl03 hcl05 hcl06 hcl10 hcl11 hcl12 hcl13 hcl14 hcl15 hcl16 Nodes not upgraded because unbootable before upgrade (5) hcl02 hcl04 hcl07 hcl08 hcl09 vi /etc/apt/sources.list replace squeeze with wheezy In parallel on all with Cluster ssh (cssh) apt-get update apt-get dist-upgrade reboot Had to manually boot hcl12 hcl13 becuase of tempiture warning. 
apt-get autoremove Questions: Q:Configuration file `/etc/pam.d/sshd' modified (by you or by a script) since installation. A:keep your currently-installed version Q:A new version of configuration file /etc/default/nfs-common is available A:Keep installed version Q:A new version of configuration file /etc/default/grub is available A:Install package managers version. ==Installation of PAPI - failed== get papi-5.2.0.tar.gz tar -xzvf papi-5.2.0.tar.gz cd papi-5.2.0/src/ ./configure --prefix=/usr/local make make test Make test failed. Hardware too old, would need to patch kernel. Abandoned installing Papi on HCL for now. ==Installation of Extrae apt-get install apt-get install libunwind7-dev apt-get install binutils-dev get extrae-2.4.0.tar.bz2 tar -xjvf extrae-2.4.0.tar.bz2 cd extrae-2.4.0 ./configure --with-mpi=/usr/lib/openmpi --with-unwind=/usr --without-dyninst --without-papi --prefix=/usr/local Note - this is a lot of good features of extrae turned off because if missing packages. make make install 591e87d1a8aba1794b96ce84a3e5f0b272cbe148 826 825 2013-09-26T21:33:01Z Davepc 2 /* Upgrade September 2013 */ wikitext text/x-wiki HCL Nodes will be installed from a clone of a root node, <code>hcl07</code>. The general installation of the root is documented here. There are a number of complications as a result of the cloning process. Solutions to these complications are also explained. See also [[HCL_cluster/heterogeneous.ucd.ie_install_log]] =General Installation= Partition filesystem with swap at the end of the disk, size 1GB, equal to maximum of the installed memory on cluster nodes. Root file system occupies the remainder of the disk, EXT4 format. Install long list of packages. ==Networking== Configure network interface as follows: <source lang="text"># This file describes the network interfaces available on your system # and how to activate them. For more information, see interfaces(5). 
# The loopback network interface
auto lo eth0 eth1
iface lo inet loopback

# The primary network interface
allow-hotplug eth0
iface eth0 inet dhcp

allow-hotplug eth1
iface eth1 inet dhcp
</source>

===Routing Tables===
Add an executable script named <code>00routes</code> to the directory <code>/etc/network/if-up.d</code>. This script will be called after the interfaces listed as <code>auto</code> are brought up on boot (or on a networking service restart). Note: the DHCP process for eth1 is backgrounded so that the startup of other services can continue. Our routing script adds routes on this interface even though it is not yet fully up; the script outputs some errors, but the routing entries remain nonetheless. The script should read as follows:
<source lang="bash">
#!/bin/sh
# Static Routes
# route ganglia broadcast traffic
route add -host 239.2.11.72 dev eth0
# all traffic to heterogeneous gate goes through eth0
route add -host 192.168.20.254 dev eth0
# all subnet traffic goes through a specific interface
route add -net 192.168.20.0 netmask 255.255.255.0 dev eth0
route add -net 192.168.21.0 netmask 255.255.255.0 dev eth1
</source>
The naming of the script is important: we want our routes in place before the other scripts in the <code>/etc/network/if-up.d</code> directory are executed, and the order in which they are executed is alphabetical.

===Hosts===
Change the hosts file so that it does not list the node's hostname; otherwise this would confuse nodes that are cloned from this image.
<source lang="text">
127.0.0.1       localhost

# The following lines are desirable for IPv6 capable hosts
::1     localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
</source>

==Ganglia==
Install the ganglia-monitor package.
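The alphabetical execution order that motivates the <code>00</code> prefix on the <code>00routes</code> script above can be sketched quickly; the script names below are illustrative stand-ins, not the actual contents of the directory:

```shell
# Scripts in /etc/network/if-up.d run in lexical order, so a "00" prefix
# sorts ahead of typical hook names (illustrative names below).
printf '%s\n' mountnfs ntpdate 00routes openssh-server | LC_ALL=C sort
# "00routes" is printed first
```

The hooks are executed via run-parts, which uses this same lexical ordering.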
Configure the ganglia monitor by editing <code>/etc/ganglia/gmond.conf</code> so that it contains:
<source lang="text">
cluster {
  name = "HCL Cluster"
  owner = "University College Dublin"
  latlong = "unspecified"
  url = "http://hcl.ucd.ie/"
}
</source>
And also the following:
<source lang="text">
/* Feel free to specify as many udp_send_channels as you like.
   Gmond used to only support having a single channel */
udp_send_channel {
  mcast_join = 239.2.11.72
  port = 8649
  ttl = 1
}

/* You can specify as many udp_recv_channels as you like as well. */
udp_recv_channel {
  mcast_join = 239.2.11.72
  port = 8649
  bind = 239.2.11.72
}
</source>
After all packages are installed, execute:
<source lang="text">
service ganglia-monitor restart
</source>

==NIS Client==
Install the nis package. Set <code>/etc/defaultdomain</code> to contain <code>heterogeneous.ucd.ie</code>.

Make sure the NIS server has an entry in <code>/etc/hosts</code>; DNS may not be active when the NIS client is starting, and we want to ensure that it connects to the server successfully.
 192.168.21.254 heterogeneous.ucd.ie heterogeneous

Make sure the file <code>/etc/nsswitch.conf</code> contains:
 passwd:         compat
 group:          compat
 shadow:         compat

Append to the file <code>/etc/passwd</code> the line <code>+::::::</code>

Append to the file <code>/etc/group</code> the line <code>+:::</code>

Append to the file <code>/etc/shadow</code> the line <code>+::::::::</code>

Edit <code>/etc/yp.conf</code> (this wasn't needed before; it is now, with the raid server):
 ypserver 192.168.20.254

Start the nis service:
 service nis start

Check that nis is operating correctly by running the following command:
 ypcat passwd

==NFS==
 apt-get install nfs-common portmap

Add the line to <code>/etc/fstab</code>:
 192.168.20.254:/home /home nfs soft,retrans=6 0 0

Set in <code>/etc/default/nfs-common</code> (this wasn't needed before; it is now, with the raid server):
 NEED_IDMAPD=yes

Then:
 service nfs-common restart
 mount /home

==Torque PBS==
First install PBS on the headnode, [[New_heterogeneous.ucd.ie_install_log#Packages_for_nodes|explained here]]. Then:
 ./torque-package-mom-linux-i686.sh --install
 ./torque-package-clients-linux-i686.sh --install
 update-rc.d pbs_mom defaults
 service pbs_mom start

==NTP==
Install the NTP software:
 apt-get install ntp

Edit the configuration and make <code>server heterogeneous.ucd.ie</code> the sole server entry. Comment out any other servers. Restart the NTP service.

=Complications=
==Hostnames==
Debian does not pull the hostname from the DHCP server. Without intervention, cloned nodes will keep the hostname stored on the image of the root node.
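As an aside on the NIS compat markers appended above: each <code>+</code> line must carry the same number of colon-separated fields as a real entry in that file (7 for passwd, 4 for group, 9 for shadow). A throwaway sketch to count them:

```shell
# Count the colon-separated fields in each NIS compat marker line.
# Expected output: 7 (passwd), 4 (group), 9 (shadow).
for entry in '+::::::' '+:::' '+::::::::'; do
    echo "$entry" | awk -F: '{print NF}'
done
```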
A bug report on setting the hostname via DHCP can be found [https://bugs.launchpad.net/ubuntu/+source/dhcp3/+bug/90388 here]. The solution we will use is to add the file <code>/etc/dhcp3/dhclient-exit-hooks.d/hostname</code> with the contents:
<source lang="bash">
if [ -n "$new_host_name" ]; then
    echo "$new_host_name" > /etc/hostname
    /bin/hostname "$new_host_name"
fi
</source>
(Note: the hook is sourced by the /bin/sh dhclient-script, so POSIX <code>[ ]</code> is used rather than bash-only <code>[[ ]]</code>.) The effect of this is to set the hostname of the machine after an interface is configured using dhclient (the DHCP client). Note that the hostname of the machine will be set by the last interface that is configured via DHCP; in the current configuration that will be <code>eth0</code>. If an interface is reconfigured using dhclient, the hostname will be reset to the name belonging to that interface.

Further, the current hostnames for the second interface (<code>eth1</code>) on the nodes are '''invalid''': they follow the format hcl??_eth1.ucd.ie, but the '_' character is not permitted in hostnames, and attempting to set such a hostname fails.

==udev and Network Interfaces==
The udev system attempts to keep network interface names consistent regardless of changing hardware. This may be useful for laptops with wireless cards that are plugged in and out, but it causes problems when trying to install our root node image across all machines in the cluster. A description of the problem can be read [http://www.ducea.com/2008/09/01/remove-debian-udev-persistent-net-rules/ here]. The solution is to remove the udev rules for persistent network interfaces and disable the generator script for these rules.
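The dhclient exit-hook logic above can be dry-run unprivileged by pointing it at a scratch file instead of <code>/etc/hostname</code> (a sketch; <code>new_host_name</code> is normally set by dhclient from the DHCP lease, and the real <code>/bin/hostname</code> call is skipped here):

```shell
# Dry-run sketch of the dhclient exit hook: write to a temp file rather
# than /etc/hostname, and skip the real /bin/hostname call.
hostname_file=$(mktemp)
new_host_name=hcl03          # dhclient would set this from the DHCP lease
if [ -n "$new_host_name" ]; then
    echo "$new_host_name" > "$hostname_file"
fi
cat "$hostname_file"         # prints: hcl03
rm -f "$hostname_file"
```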
On the root cloning node do the following:
# remove the file <code>/etc/udev/rules.d/70-persistent-net.rules</code>
# add to the top of the file <code>/lib/udev/rules.d/75-persistent-net-generator.rules</code> the following lines:
<source lang="text">
# skip generation of persistent network interfaces
ACTION=="*", GOTO="persistent_net_generator_end"
</source>

==Sysstat==
[http://pagesperso-orange.fr/sebastien.godard/ Sysstat] is a useful suite of tools for measuring the performance of different system components. Unfortunately, it adds some unwanted cron entries for collecting a historical set of system performance data. Though these cron entries point to disabled scripts, we will disable them nonetheless. Edit the file <code>/etc/cron.d/sysstat</code> and comment out all lines.
<source lang="text">
# The first element of the path is a directory where the debian-sa1
# script is located
#PATH=/usr/lib/sysstat:/usr/sbin:/usr/sbin:/usr/bin:/sbin:/bin
# Activity reports every 10 minutes everyday
#5-55/10 * * * * root command -v debian-sa1 > /dev/null && debian-sa1 1 1
# Additional run at 23:59 to rotate the statistics file
#59 23 * * * root command -v debian-sa1 > /dev/null && debian-sa1 60 2
</source>

=Upgrade September 2013=
Upgrade motivated by needing a kernel newer than 2.6.35 for PAPI.

Before:
 uname -a
 Linux hcl03.heterogeneous.ucd.ie 2.6.32-5-686 #1 SMP Wed Jan 12 04:01:41 UTC 2011 i686 GNU/Linux
After:
 Linux hcl03.heterogeneous.ucd.ie 3.2.0-4-686-pae #1 SMP Debian 3.2.46-1+deb7u1 i686 GNU/Linux

Nodes upgraded (11): hcl01 hcl03 hcl05 hcl06 hcl10 hcl11 hcl12 hcl13 hcl14 hcl15 hcl16

Nodes not upgraded because they were unbootable before the upgrade (5): hcl02 hcl04 hcl07 hcl08 hcl09

Edit <code>/etc/apt/sources.list</code> and replace squeeze with wheezy. Then, in parallel on all nodes with Cluster ssh (cssh):
 apt-get update
 apt-get dist-upgrade
 reboot

Had to manually boot hcl12 and hcl13 because of a temperature warning.
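The sources.list edit above can also be scripted rather than done in vi, which is handy when driving all nodes through cssh. A sketch, shown on a sample line rather than the real file (the mirror URL is illustrative; on a node one would run <code>sed -i</code> on <code>/etc/apt/sources.list</code>):

```shell
# Swap the release name in an apt source line (sample input; use
# sed -i 's/squeeze/wheezy/g' /etc/apt/sources.list on a real node).
echo 'deb http://ftp.ie.debian.org/debian/ squeeze main' \
    | sed 's/squeeze/wheezy/g'
# prints: deb http://ftp.ie.debian.org/debian/ wheezy main
```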
 apt-get autoremove

NFS was not starting on boot, so added to <code>/etc/rc.local</code>:
 mount /home

Changed the <code>/etc/fstab</code> line to:
 192.168.20.254:/home /home nfs rw,rsize=4096,wsize=4096,hard,intr 0 0

==Questions==
Q: Configuration file `/etc/pam.d/sshd' modified (by you or by a script) since installation.

A: keep your currently-installed version

Q: A new version of configuration file /etc/default/nfs-common is available

A: Keep installed version

Q: A new version of configuration file /etc/default/grub is available

A: Install the package manager's version.

==Problems==
On every package, getting the following warnings:
 ldconfig: Can't link /opt/lib//opt/lib/libf77blas.so to libf77blas.so
 ldconfig: Can't link /opt/lib//opt/lib/libcblas.so to libcblas.so
 ldconfig: Can't link /opt/lib//opt/lib/libatlas.so to libatlas.so

==Installation of PAPI - failed==
 get papi-5.2.0.tar.gz
 tar -xzvf papi-5.2.0.tar.gz
 cd papi-5.2.0/src/
 ./configure --prefix=/usr/local
 make
 make test

Make test failed. Hardware too old; would need to patch the kernel. Abandoned installing PAPI on HCL for now.

==Installation of Extrae==
 apt-get install
 apt-get install libunwind7-dev
 apt-get install binutils-dev
 get extrae-2.4.0.tar.bz2
 tar -xjvf extrae-2.4.0.tar.bz2
 cd extrae-2.4.0
 ./configure --with-mpi=/usr/lib/openmpi --with-unwind=/usr --without-dyninst --without-papi --prefix=/usr/local

Note: this leaves a lot of good features of Extrae turned off because of missing packages.

 make
 make install

4e657fd6cd7ff8b9ab4843b034e5c2c672f1403b